much theory with it (Richtel, 2016). The word accident implies randomness and
unpredictability and luck—pure happenstance. Safety engineers know all too well that 
automobile crash risk has strong statistical relationships to many behaviors, none of 
which are random or happenstance. The engineers have in mind cases like St. Louis 
Cardinals pitcher Josh Hancock who slammed his rented SUV into a truck stopped 
on the highway with lights flashing (Vanderbilt, 2008). Calling the crash random and
unpredictable (an accident) seems not at all right when we consider that Hancock [...]
(a strong risk factor), and was on a cell phone at the time of the crash (a strong risk 
factor). Oh, and he had crashed another SUV just two days before (Vanderbilt, 2008). 
Terming this an “accident” conveys a theory of randomness and unpredictability that 
does not seem right when the chosen behaviors were so wantonly reckless as in this 
case. The description of what happened is—a crash. As a theory, accident seems not 
quite right. 


Prying Variables Apart: Special Conditions 


The Goldberger pellagra example illustrates a very important lesson that can greatly
aid in dispelling some misconceptions about the scientific process, particularly as it
is applied in psychology. The occurrence of any event in the world is often correlated 
with many other factors. In order to separate, to pry apart, the causal influence of 
many simultaneously occurring events, we must create situations that will never occur 
in the ordinary world. Scientific experimentation breaks apart the natural correlations 
in the world to isolate the influence of a single variable. 

Psychologists operate in exactly the same manner: by isolating variables via 
manipulation and control. For example, cognitive psychologists interested in the reading
process have studied the factors that make word perception easier or more difficult.
Not surprisingly, they have found that longer words are more difficult to recognize
than shorter words. At first glance, we might think that the effect of word length would 
be easy to measure: Simply create two sets of words, one long and one short, and mea- 
sure the difference in reader recognition speed between the two. Unfortunately, it is 
not that easy. Long words also tend to be less frequent in language, and frequency itself 
also affects perception. Thus, any difference between long and short words may be 
due to length, frequency, or a combination of these two effects. In order to see whether 
word length affects perception independently of frequency, researchers must construct 
special word sets in which length and frequency do not vary together. 
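
The logic of such special word sets can be illustrated with a minimal simulation sketch in Python. Everything in it is an assumption made for illustration: the simulated lexicon, the coefficients linking length and frequency to recognition time, and the function names are hypothetical and are not taken from any actual study. The sketch compares a naive contrast between 10-letter and 5-letter words with the same contrast restricted to words of similar frequency.

```python
import random
from statistics import mean

def simulate_lexicon(n=2000, seed=0):
    """Build a hypothetical lexicon in which longer words tend to be rarer."""
    rng = random.Random(seed)
    words = []
    for _ in range(n):
        length = rng.randint(3, 12)
        # Assumed natural correlation: log-frequency falls as length rises.
        log_freq = 6.0 - 0.3 * length + rng.gauss(0, 1.0)
        # Assumed ground truth: recognition time depends on BOTH variables.
        rt_ms = 400 + 10 * length - 30 * log_freq + rng.gauss(0, 20)
        words.append((length, log_freq, rt_ms))
    return words

def length_effect(words, short=5, long=10, freq_band=None):
    """Mean recognition time of long words minus that of short words.

    If freq_band is given, only words whose log-frequency falls inside
    the band are compared, so frequency is held roughly constant.
    """
    if freq_band is not None:
        lo, hi = freq_band
        words = [w for w in words if lo <= w[1] < hi]
    rt_long = [rt for length, _, rt in words if length == long]
    rt_short = [rt for length, _, rt in words if length == short]
    return mean(rt_long) - mean(rt_short)

words = simulate_lexicon()
naive = length_effect(words)
matched = length_effect(words, freq_band=(3.0, 4.0))
print(f"naive difference:   {naive:.0f} ms")
print(f"matched difference: {matched:.0f} ms")
```

Under these assumed values, the true effect of five extra letters is 50 ms. The naive contrast should come out well above that figure because it also absorbs the frequency difference between long and short words. The matched contrast should land much closer to it.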

Similarly, Goldberger was able to make a strong inference about causation 
because he set up a special set of conditions that does not occur naturally. (Considering 
that one manipulation involved the ingestion of bodily discharges, this is putting it 
mildly!) Recall that Oskar Pfungst had to set up some special conditions for testing 
Clever Hans, including trials in which the questioner did not know the answer. Dozens 
of people who merely observed the horse answer questions under normal conditions 
(in which the questioner knew the answer) never detected how the horse was accom- 
plishing its feat. Instead, they came to the erroneous conclusion that the horse had true 
mathematical knowledge. 

Likewise, note the unusual conditions that were necessary to test the claims of 
facilitated communication. The stimuli presented to the facilitator and the child had to 
be separated in a way that neither could see the stimulus presented to the other. Such 
unusual conditions are necessary in order to test the alternative hypotheses for the 
phenomenon. 

Many classic experiments in psychology involve this logic of prying apart the nat- 
ural relationships that exist in the world so that it can be determined which variable 
is the dominant cause. Psychologist Harry Harlow’s famous experiments (Harlow 
& Suomi, 1970; Tavris, 2014) provide a case in point. Harlow wanted to test a pre- 
vailing hypothesis about infant-mother attachment: That attachment resulted from 
the mother providing the infant’s source of food. However, the problem was that, of 
course, mothers provide much more than nourishment (comfort, warmth, caressing, 
stimulation, etc.). Harlow examined the behavior of infant macaque monkeys in situ- 
ations in which he isolated only one of the variables associated with attachment by 
giving the animals choices among “artificial” mothers. For example, he found that
the contact comfort provided by a “mother” made of terrycloth was preferred to that
provided by a “mother” made of wire. The infants also preferred a cold terrycloth
mother to a warm wire one, a preference indicating that the contact
comfort was more attractive than warmth. Finally, Harlow found that the infants
preferred the terrycloth mother even when their nourishment came exclusively from a
wire mother. Thus, the hypothesis that attachment was due solely to the nourishment 
provided by mothers was falsified. This was possible only because Harlow was able to 
pry apart variables that naturally covary in the real world. 

Creating special conditions to test for actual causal relationships is a key tool we 
can use to prevent pseudoscientific beliefs from attacking us like a virus (Stanovich, 
2004, 2009, 2011). Consider the case of therapeutic touch (TT)—a fad that swept the 
North American nursing profession in the 1990s. TT practitioners massage not the
patient’s body but instead the patient’s so-called energy field. That is, they move their
hands over the patient’s body but do not actually massage it. Practitioners reported 
“feeling” these energy fields. Well, you guessed it. This ability to feel “energy fields” is 
tested properly by creating exactly the type of special conditions as in the Clever Hans 
and facilitated communication claims—that is, testing whether practitioners, when 
visually blinded, could still feel whether their hands were in proximity to a human 
body. Research has demonstrated the same thing as in the Clever Hans and facilitated 
communication cases—when vision is occluded, this ability to feel at a distance is no 
greater than chance (Hines, 2003; Shermer, 2005). This example actually illustrates
something that was mentioned in an earlier chapter—that the logic of the true experiment
is really so straightforward that a child could understand it. This is because
one of the published experiments showing that TT is ineffective was done as a school 
science project (Dacey, 2008). 

In short, it is often necessary for scientists to create special conditions that will test 
a particular theory about a phenomenon. Merely observing the event in its natural state 
is rarely sufficient. People observed falling and moving objects for centuries without 
arriving at accurate principles and laws about motion and gravity. Truly explanatory 
laws of motion were not derived until Galileo and other scientists set up some rather 
artificial conditions for the observation of the behavior of moving objects. In Galileo’s 
time, smooth bronze balls were rarely seen rolling down smooth inclined planes. Lots 
of motion occurred in the world, but it was rarely of this type. However, it was just 
such an unnatural situation, and others like it, that led to our first truly explanatory 
laws of motion and gravity. Speaking of laws of motion, didn’t you take a little quiz at 
the beginning of this chapter? 


Intuitive Physics 


Actually, the three questions posed at the beginning of this chapter were derived 
from the work that psychologists have done on so-called “intuitive physics,” that is, 
people’s beliefs about the motion of objects. Interestingly, these beliefs are often at 
striking variance from how moving objects actually behave (Bloom & Weisberg, 2007; 
Riener et al., 2005). 

For example, in the first problem, once the string on the circling ball is cut, the 
ball will fly in a straight line at a 90-degree angle to the string (tangent to the circle). 
McCloskey (1983) found that one-third of the college students who were given this 
problem thought, incorrectly, that the ball would fly in a curved trajectory. About half 
of McCloskey’s subjects, when given problems similar to the bomber pilot example, 
thought that the bomb should be dropped directly over the target, thus displaying 
a lack of understanding of the role of an object’s initial motion in determining its 
trajectory. The bomb should actually be dropped five miles before the plane reaches the 
target. The subjects’ errors were not caused by the imaginary nature of the problem. 
When subjects were asked to walk across a room and, while moving, drop a golf ball
on a target on the floor, many did not realize that the ball would travel forward as it
fell. Many people are also not
aware that a bullet fired horizontally from a rifle will hit the ground at the same time as a bullet
dropped from the same height. 
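
The five-mile figure can be checked with a short calculation, sketched here in Python. The plane's speed and altitude (500 miles per hour at 20,000 feet) are assumed values for the quiz problem, since they are not restated in this section, and air resistance is ignored. The function names are hypothetical.

```python
import math

G_FT_PER_S2 = 32.174  # standard gravitational acceleration, in feet per second squared

def fall_time_s(height_ft):
    """Seconds needed to fall height_ft from vertical rest, ignoring air resistance.

    Horizontal speed does not appear here, which is why a bullet fired
    horizontally and a bullet dropped from the same height land together.
    """
    return math.sqrt(2 * height_ft / G_FT_PER_S2)

def release_distance_miles(speed_mph, height_ft):
    """Horizontal distance (miles) the object keeps traveling while it falls."""
    return speed_mph * fall_time_s(height_ft) / 3600  # 3600 seconds per hour

# Assumed quiz values: 500 mph at 20,000 feet.
print(round(fall_time_s(20_000), 1))                  # about 35.3 seconds
print(round(release_distance_miles(500, 20_000), 1))  # about 4.9 miles
```

The bomb keeps the plane's forward speed for the roughly 35 seconds it takes to fall, so it must be released nearly five miles before the target.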

You can assess your own performance on this little quiz. Chances are that you 
missed at least one if you have not had a physics course recently. “Physics course!” you 
might protest. “Of course I haven’t had a physics class recently. This quiz is unfair!” 
But hold on a second. Why should you need a physics course? You have seen literally 
hundreds of falling objects in your lifetime. You have seen them fall under naturally 
occurring conditions. Moving objects surround you every day, and you are seeing them 
in their “real-life” state. You certainly cannot claim that you have not experienced 
moving and falling objects. Granted, you have never seen anything quite like the bullet
example. But most of us have seen children let go of whirling objects, and many of
us have seen objects fall out of planes. And besides, it seems a little lame to protest that 
you have not seen these exact situations. Given your years of experience with moving 
and falling objects, why can’t you accurately predict what will happen in a situation 
only slightly out of the ordinary? 

It is critical to understand that the layperson’s beliefs are inaccurate precisely 
because his or her observations are “natural,” rather than controlled in the manner of 
the scientist’s. Thus, if you missed a question on the little quiz at the beginning of the 
chapter, don’t feel ignorant or inadequate. Simply remember that some of the world’s
greatest minds observed falling objects for centuries without formulating a physics of
motion any more accurate than that of the modern high school sophomore. 

Psychological research on intuitive physics demonstrates something of funda- 
mental importance in understanding why scientists behave as they do. Despite exten- 
sive experience with moving and falling objects, people’s intuitive theories of motion 
are remarkably inaccurate. Experience provides no inoculation against these intuitive 
errors. For example, experienced taxi drivers make most of the same speed and journey
time errors that nonprofessional drivers do (Peer & Solomon, 2012).


Intuitive Psychology 


If our intuitive (or “folk”) theories about objects in motion are inaccurate, it is hard 
to believe that our folk theories in the more complex domain of human behavior will 
be exceedingly accurate. Indeed, this research literature serves to warn us that per- 
sonal experience is no guarantee against incorrect beliefs about human psychology. 
Psychologist Dan Ariely (2015) tells the story of suffering burns over 70 percent of his 
body as the result of an accident when he was 18 years old. He describes many months 
of subsequent treatment in which bandages that were removed quickly caused him 
great pain. The theory held by the nurses was that a quick removal (which caused a 
sharp pain) was preferable to slow removal which would cause a longer—although 
less intense—pain. After leaving the hospital and beginning his career as a psychology 
student, Ariely conducted experiments to test the nurses’ belief. To his surprise, Ariely 
found that the slower procedure—lower pain intensity over a longer period—would 
have reduced the pain perception in such situations. He said that by the time he had 
finished, he realized that the nurses in the burn unit were kind and generous individu- 
als with a lot of experience in soaking and removing bandages, but that “despite all 
their experience, they erred in treating the patients they cared so much about. They 
still didn’t have the right theory about what would minimize their patients’ pain. How 
could they be so wrong, I wondered, considering their vast experience? Perhaps other 
professionals might also be misunderstanding the consequences of their behaviors 
and make poor decisions” (Ariely, 2015, p. C3). Research indicates that intuitive judgments
of pain intensity in other people are quite bad, even among physicians with
much clinical experience (Tait et al., 2009). 

“Common practice” can often obscure the need for a control group to check the veracity
of a conclusion derived from informal observation. For example, Dingfelder (2006)
describes how many medical professionals believe that they should not advise individ- 
uals with Tourette syndrome (described in Chapter 2) to suppress their tics (involun- 
tary vocal expressions). The physicians believed that this caused a so-called rebound 
effect—a higher rate of tics occurring after the suppression. This belief, though, is 
based on informal observation rather than controlled experimentation. When the 
proper experimentation was done—observing the number of tics systematically by 
comparing a period of suppression to a period of nonsuppression—it appeared that 
there was no “rebound” effect at all following tic suppression. 

In Chapter 1, we illustrated that a number of commonsense (or folk) beliefs about
human behavior are wrong, and this was just a small sample. For example, it turns
out that there is no strong evidence indicating that highly religious people are more 
altruistic than less religious people (Paloutzian & Park, 2005). Studies have indicated 
that there is no simple relationship between degree of religiosity and the tendency to 
engage in charitable acts, to aid other people in distress, or to abstain from cheating 
other people. 

Incorrect intuitive theories are not limited to psychology. They are also
rampant in the world of sport and physical fitness. For example, quantitative analyses
have indicated that in football (at all levels, from high school to professional)
most coaches would increase their probability of winning by going for it on fourth down
when their teams are at midfield (Moskowitz & Wertheim, 2011). Similar analyses 
have shown that, overall, coaches should punt less and on-side kick more. Statistics 
prove that if coaches reoriented their strategies in these respects, they would win more 
games (Moskowitz & Wertheim, 2011). Now, coaches might have a variety of reasons 
for ignoring this statistical advice (fear of being second-guessed, for example), but 
these reasons do not apply to the fans. Nevertheless, fans have the incorrect intuitive 
theory that the coaches are right. 

Incorrect beliefs about human behavior can have very practical consequences. 
Keith and Beins (2008) mention that among their students, typical views about cell 
phones and driving are captured by statements such as “Talking doesn’t impair my 
driving” and “I talk on the phone to keep myself from falling asleep.” The students 
seem completely oblivious to the fact that driving while using a cell phone (even a 
hands-free phone) seriously impairs concentration and attention (Kunar et al., 2008; 
Richtel, 2014; Strayer et al., 2016; Strayer & Drews, 2007) and is a cause of accidents 
and deaths (McEvoy et al., 2005; Novotny, 2009; Parker-Pope, 2009; Richtel, 2014). It is 
just as dangerous as drunk driving. Texting while driving is particularly lethal. 

The list of popular beliefs that are incorrect is long. For example, many people 
believe that a full moon affects human behavior. It doesn’t (Univ. of California, 2013; 
Foster & Roenneberg, 2008). Some people believe that “opposites attract.” They don’t 
(Youyou et al., 2017). Some people believe that you shouldn’t change an answer on 
a multiple choice test. They’re wrong (Kruger et al., 2005). Some people believe that 
“familiarity breeds contempt.” It doesn’t (Claypool et al., 2008; Zebrowitz et al., 2008). 
Some people believe that people behave like robots under hypnosis. They don’t 
(Lilienfeld, 2014). And the list goes on and on and on (see Lilienfeld et al., 2010). 

The many inadequacies in people’s intuitive theories of behavior illustrate why 
we need the controlled experimentation of psychology: so that we can progress 
beyond our flat-earth conceptions of human behavior to a more accurate scientific 
conceptualization. 


Summary


The heart of the experimental method involves manipulation
and control. This is why an experiment allows stronger
causal inferences than a correlational study. In a correla- 
tional study, the investigator simply observes whether the 
natural fluctuation in two variables displays a relationship. 
By contrast, in a true experiment the investigator manipu- 
lates the variable hypothesized to be the cause and looks 
for an effect on the variable hypothesized to be the effect 
while holding all other variables constant by control and 
randomization. This method removes the third-variable
problem present in correlational studies. The third-variable
problem arises because, in the natural world, many differ- 
ent things are related. The experimental method may be 
viewed as a way of prying apart these naturally occurring 
relationships. It does so because it isolates one particular 
variable (the hypothesized cause) by manipulating it and 
holding everything else constant. However, in order to 
pry apart naturally occurring relationships, scientists often 
have to create special conditions that are unknown in the 
natural world. 


Chapter 7 


“But It’s Not Real Life!”: The “Artificiality” Criticism and Psychology


Learning Objectives 


7.1 Explain how the purpose of the experiment determines its design 


7.2 Summarize the applicability of theory in areas of psychological 
research 


Having covered the basics of experimental logic in the previous two chapters, we are 
now in a position to consider some often-heard criticisms of the field of psychology. 
In particular, we will discuss at length the criticism that psychology experiments are 
useless because they are artificial and not like “real life.” 


Why Natural Isn’t Always Necessary


The discussion in Chapter 6 already suggests why this criticism
is invalid. As was illustrated in that chapter, the artificiality of scientific experimen- 
tation is not a weakness but actually the very thing that gives the scientific method 
its unique power to yield explanations about the nature of the world. Contrary to 
common belief, the artificiality of scientific experiments is not an accidental over- 
sight. It is intentionally sought (it’s a feature, not a bug!). Scientists deliberately set 
up conditions that are unlike those that occur naturally because this is the only way 
to separate the many inherently correlated variables that determine events in the 
world. To use a phrase from Chapter 6, scientists set up special conditions in order 
to pry variables apart. 


Sometimes the necessary conditions already exist in the natural world, as in the example
of Snow and cholera. More often, this is not the case. The scientist must manipulate
events in ways that often cannot be arranged in natural
environments, and the scientist finds it necessary to bring the phenomenon into the 
laboratory, where more precise control is possible. 

Indeed, some phenomena would be completely impossible to discover if scien- 
tists were restricted to observing “natural” conditions. Physicists probing the most 
fundamental characteristics of matter build gigantic mile-long accelerators that induce 
collisions between elementary particles. Some of the by-products of these collisions
are new particles that exist for less than a billionth of a second. The properties of these
new particles, however, have implications for theories of atomic structure. Many of 
these new particles would not ordinarily exist on earth, and even if they did, there cer- 
tainly would be no chance of observing them naturally. Yet few people doubt that this 
is how physicists should conduct their research—that probing nature in unusual and 
sometimes bizarre ways is a legitimate means of coming to a deeper understanding of [...]

Many psychologists who have presented experimental evidence on behavior
to an audience of laypersons have heard the lament “But it’s not real life!” This
remark reflects the belief that studying human psychology in the laboratory is some- 
how strange. This objection also contains the assumption that knowledge cannot be 
obtained unless natural conditions are studied. 

It is not commonly recognized that many of the techniques used by the psychologist
that are viewed as strange by the public are in no way unique to psychology; instead, 
they are manifestations of the scientific method as applied to behavior. Restriction 
to real-life situations would prevent us from discovering many things. For example, 
biofeedback techniques are now used in a variety of areas such as migraine and ten- 
sion headache control, hypertension treatment, and relaxation training (deCharms et 
al., 2005; Maizels, 2005). These techniques developed out of research indicating that 
humans could learn partial control of their internal physiological processes if they 
could monitor the ongoing processes via visual or auditory feedback. Of course, 
because humans are not equipped to monitor their physiological functions via exter- 
nal feedback, the ability to control such processes does not become apparent except 
under special conditions. 

Consider saccadic eye movements, often a focus of researchers studying the read- 
ing process (Seidenberg, 2017). People have the impression that their eyes move 
smoothly across the page when reading, but actually they don’t. Introspection 
is not a very good guide to the reading process. The eyes are stationary most of 
the time during reading and actually move during brief, so-called saccadic move- 
ments of 20-40 milliseconds. In between saccades, the eyes are relatively stationary 
for 200-300 millisecond periods. In short, contrary to introspection, your eyes are 
stationary most of the time during reading, but they also jump 3-4 times per second.
During the jumps you are functionally blind, by the way! None of these facts can be
discerned in detail without special conditions and special instrumentation.


The Random Sample Versus Random Assignment Confusion


Sometimes, however, the “it’s not real life” complaint arises from a different type of 
confusion about the purposes of psychological experimentation, one that is actually 
quite understandable. Through media exposure, many people are familiar with survey 
research, particularly in the form of election and public opinion polling. There is now a
growing awareness of some of the important characteristics of election polling. In particular,
the media have given more attention to the importance of a random, or representative,
sample for the accuracy of public opinion polls. This attention has led many
people to believe, mistakenly, that random samples and representative conditions 
are an essential requirement of all psychological investigations. Because psychologi- 
cal research seldom uses random samples of subjects, the application of the random 
sample criterion by the layperson seems to undermine most psychological investiga- 
tions and to reinforce the criticism that the research is invalid because it doesn’t reflect 
real life. Actually, it is not necessary for every psychological investigation to employ a 
random sample of participants.

Random sampling and random assignment (discussed in Chapter 6) are not the
same thing. Because they both have the term “random” in them, many people come 
to think that random assignment and random sampling are the same. Actually they 
are very different concepts—similar only in that they make use of the properties of 
random number generation. But they are used for very different purposes. 

Random sampling refers to how subjects are chosen to be part of a study.
Specifically, it means drawing a sample from the population in a manner that
ensures that each member of the population has an equal chance of being chosen for 
the sample. The sample that is drawn then becomes the subject of the investigation. 
It is important to understand that the investigation could be either a correlational 
study or a true experiment. It is not a true experiment unless random assignment is 
also used. 

Random assignment is a requirement of a true experiment in which an experimen- 
tal group and a control group are formed by the experimenter. Random assignment 
is achieved when each subject is just as likely to be assigned to the control group as 
to the experimental group. This is why a randomizing device such as a coin flip (or 
more often, a specially prepared table of random numbers) is employed—because it 
displays no bias in assigning the subjects to groups. 

The best way to keep in mind that random assignment and random sampling are 
not the same thing is always to be clear that any of the four combinations can occur: 
nonrandom sampling without random assignment, nonrandom sampling with ran- 
dom assignment, random sampling without random assignment, and random sam- 
pling with random assignment. Most psychological research does not employ random 
sampling. The research involves theory testing, as we will see in the next section, and 
a convenience sample is all that is necessary. If random assignment is employed in the 
study, then it becomes a true experiment. If random assignment is not employed, then 
the study is a correlational investigation. Many studies that do use random sampling 
do not employ random assignment because they are surveys and are only looking for 
associations—that is, they are correlational investigations. 
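
The distinction can also be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population, the sample sizes, and the function names are hypothetical. It shows two of the four combinations described above, with Python's standard random module standing in for the coin flip or table of random numbers.

```python
import random

def random_sample(population, n, rng):
    """Random sampling: every member of the population is equally likely to be chosen."""
    return rng.sample(population, n)

def random_assignment(subjects, rng):
    """Random assignment: shuffle the subjects, then split them into two groups."""
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]  # (experimental group, control group)

rng = random.Random(42)
population = [f"person_{i}" for i in range(10_000)]  # hypothetical population

# Nonrandom (convenience) sampling WITH random assignment: a true experiment.
convenience = population[:40]  # for example, the first 40 people to sign up
experimental, control = random_assignment(convenience, rng)

# Random sampling WITHOUT random assignment: a survey, that is, a correlational study.
survey_respondents = random_sample(population, 40, rng)

print(len(experimental), len(control), len(survey_respondents))
```

Sampling decides who gets into the study at all. Assignment decides which condition each person already in the study receives. The two steps are independent, which is why either one can be random without the other.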

[...] to a particular situation. Election polling is an example of directly applied research.
The goal is to predict a specific behavior in a very specific setting—in this case, voting 
on election day. Here, where the nature of the application is direct, questions of the 
randomness of the sample and the representativeness of the conditions are important 
because the findings of the study are going to be applied directly. 

However, it would be a mistake to view this class of research as typical. The vast 
majority of research studies in psychology (or any other science, for that matter) are 
conducted with a very different purpose in mind. Their purpose is to advance theory. 
The findings of most research are applied only indirectly through modifications in a
theory that, in conjunction with other scientific laws, is then applied to some practical
problem. In short, most theory-driven research seeks to test theories of psychological
processes.
Whereas in applied research the purpose of the investigation is to go from data 
directly to a real-world application, basic research focuses on theory testing. However, 
it is probably a mistake to view the basic-versus-applied distinction solely in terms of 
whether a study has practical applications, because this difference often simply boils 
down to a matter of time. Applied findings are of use immediately. Basic research
often gets applied at a much later time, and often after many twists and turns in the
evolution of knowledge. 

The history of science is filled with examples of theories or findings that eventu- 
ally solved a host of real-world problems even though the scientists who developed 
the theories and/or findings did not intend to solve a specific practical problem. For 
example, a group of researchers at the University of Texas Southwestern Medical [...]
inflammation of the intestines (Fackelman, 1996) similar to ulcerative colitis. The
scientists now had an animal model of the human disease. Whether these scientists 
make any progress on arthritis (their original problem), it now looks as if they have 
made a substantial contribution to the eventual treatment of ulcerative colitis and 
Crohn’s disease. 

Such indirect connections are common in science. The drug company Pfizer 
was looking for a new heart treatment when it discovered Viagra (Gladwell, 2010). 
Developments in the abstract field of number theory led to the encryption technolo- 
gies that made e-commerce possible (Reif, 2016). 

Psychologist Walter Mischel’s (2015) famous “marshmallow study” is another 
example of early basic research leading to a range of practical applications. His 
procedure involved telling four-year-old children that they will receive a small reward 
(one marshmallow) or a larger reward (two marshmallows). The child gets the larger 
reward if, after the experimenter leaves the room, the child waits until the experimenter 
returns and does not recall the experimenter by ringing a bell. If the bell is rung before 
the experimenter returns, the child will get only the smaller reward. The dependent 
variable is the amount of time that the child waits before ringing the bell. His first 
studies using this famous delay of gratification paradigm were denied government 
research funding (Sleek, 2015), and he was told to seek funding from a candy company!
But longitudinal studies have shown that the test, when given to four-year-olds,
predicts adult success. The ability to delay gratification in childhood predicts such 
important life outcomes as drug use, obesity levels, and SAT scores. Mischel’s (2015) 
work has been used in several important programs to develop children’s self-control
skills (Winerman, 2014). 

Thus, we must recognize that, although some research is designed to predict 
events directly in a specific environmental situation, much scientific research is basic 
research designed to test theory. Researchers who conduct applied and basic research 
have completely different answers to the question, how do these findings apply to 
real life? The former answers, “Directly, provided that there is a reasonably close rela- 
tionship between the experimental situation and the one to which the findings are to 
be applied.” Thus, questions of the random sampling of subjects and the representa- 
tiveness of the experimental situation are relevant to the applicability of the results. 
However, the investigator in a theory-testing study answers that his or her findings 
do not apply directly to real life, and that the reason for conducting the study is not to 
produce findings that would be applicable to some specific environmental situation. 
Therefore, this scientist is not concerned with questions of how similar the subjects of
the study are to some other group or whether the experimental situation mirrors some
real-life environment. Does this mean, then, that these findings have no implications [...]
be applied to a particular problem.

This type of indirect application through theory has become quite common in some 
areas of psychology. For example, years ago when cell phones were first introduced, 
many cognitive psychologists immediately began to worry about the implications for 
safety when people began to use them while driving automobiles. The psychologists 
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immediately expected that cell phone use would cause additional accidents—and 
not just because the phone would take a hand off the wheel. Instead, what they were 
worried about was the attentional requirements of talking on the cell phone. What is 
important to realize is that the psychologists became worried about cell phone use in 
cars long before there was a single experimental study of actual cell phone use and 
its relation to accidents (Strayer et al., 2016). The psychologists made their prediction 
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a car clearly falls within the domain of those theories, which had been established 
through voluminous experimentation (literally hundreds of laboratory studies). When 
in fact the actual studies of real cell phone use were done, they confirmed the predic- 
tion derived from psychological theories of attention: Cell phone use is indeed a cause 
of motor vehicle accidents—and hands-free phones do not solve the attentional prob- 
lem, which is the main cause of the accidents (Insurance Institute for Highway Safety, 
2005; Kunar et al., 2008; Levy et al., 2006; McEvoy et al., 2005; Richtel, 2014; Strayer & 
Drews, 2007; Strayer et al., 2016). 


Applications of Psychological Theory 


We have described how the purpose of most research is to develop theory rather than 
to predict events in a specific environment. And we have described how the findings 
of most research are applied indirectly, through theory, rather than directly in a spe- 
cific environmental situation. Given these facts, though, it is legitimate to ask how 
much application through theory has been accomplished in psychology. That is, have 
psychology’s theories been put to this test of generality? 

On this point, we must admit that the record is mixed. But it is wise to keep 
psychology’s diversity in mind here. It is true that some areas of research have made 
only modest progress along these lines. However, other areas have quite impressive 
records of experimentally derived principles of considerable explanatory and predic- 
tive power. 

Consider the basic behavioral principles of classical and operant conditioning. 
These principles and their elaborating laws were developed almost entirely from 
experimentation on nonhuman subjects, such as pigeons and rats, in highly artificial 
laboratory settings. Yet these principles have been successfully applied to a wide vari- 
ety of human problems, including the treatment of autistic children, the treatment of 
alcoholism and obesity, the management of residents in psychiatric hospitals, insom- 
nia interventions, and the treatment of phobias, to name just a few. 

The principles from which these applications were derived were identified pre- 
cisely because the laboratory experimentation allowed researchers to specify the rela- 
tionships between environmental stimuli and behavior with an accuracy not possible 
in a natural situation, in which many behavioral relationships may operate simultane- 
ously. As for the use of nonhuman subjects, in many cases, theories and laws derived 
from their performance have provided good first approximations to human behavior
(Vazire & Gosling, 2003). When humans were examined, their behavior often followed
laws that were very similar to those derived from other animals. This should hardly sur- 
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contributed to developments in behavioral medicine, stress reduction, psychotherapy, 
rehabilitation of injured and handicapped individuals, studying the effects of aging 
on memory, methods to help people overcome neuromuscular disorders, understand- 
ing drug effects on fetal development, traffic safety, and the treatment of chronic pain 
(Gosling, 2001; Kalat, 2007; Zimbardo, 2004). Research with monkeys has led to some 
real advances in understanding the underlying basis of phobias and anxiety disorders 
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(Mineka & Zinbarg, 2006). Nevertheless, scientists, including psychologists, studying 
animals have come under increasing attack from animal advocates, and sometimes 
those attacks have been violent. J. David Jentsch, who works in the psychology depart- 
ment at UCLA and studies the brain circuits involved in drug addiction, had his car 
bombed by animal activists in 2009 and the resulting fire almost burned down his home 
(Collier, 2014). 
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as diverse as radar monitoring, street lighting, and airplane cockpit design (Durso et al., 
2007; Wickens et al., 2012). Much is now known about the cognitive effects of aging 
(Salthouse, 2012), and this new knowledge has direct implications for efforts to design 
systems that will help people to compensate for cognitive loss (Schaie & Willis, 2010). 

Psychological studies of judgment and decision making have had implications 
for medical decision making, educational decision making, and economic decision 
making (Croskerry, 2013; Stanovich et al., 2016; Tetlock & Gardner, 2015; Thaler, 2015). 
The famous obedience to authority studies of Stanley Milgram were used in officer 
training schools of the military (Blass, 2004; Cohen, 2008). An exciting new develop- 
ment is the increasing involvement of cognitive psychologists in the legal system, in 
which problems of memory in information collection, evidence evaluation, and deci- 
sion making present opportunities to test the applicability of cognitive theories (Wells 
et al., 2015; Wixted et al., 2015). In recent decades, theory and practice in the teaching 
of reading have been affected by research in cognitive psychology (Seidenberg, 2017; 
Willingham, 2017). 

In short, psychology has been applied to “real life” in a large number of ways, but 
little of this is known to the public. Research psychologists have found ways of getting 
people to save more for their retirement and to increase their organ donations (Thaler, 
2015), discovered how to influence people to get their flu shots (Price, 2009), invented 
behavioral programs that would reduce energy use (Attari et al., 2010), discovered 
ways to facilitate onscreen reading (Chamberlin, 2010), found how to get drivers to 
increase their hazard perception (Horswill, 2016), found ways to get health personnel 
to increase their rate of hand washing (Grant & Hofmann, 2011), found ways to reduce 
health costs (Deangelis, 2010), found ways to decrease the occurrence of wrong-side 
surgery (McKinley et al., 2015; Zuger, 2015), have found out how to increase voter 
turnout (Bryan et al., 2011), and have found the answer to the age-old question of why 
children hate school (Willingham, 2010). 

These applications of psychology have become so predictable and numerous that 
governments have formed special units to facilitate the use of behavioral science to 
foster broad public goals (Appelbaum, 2015; Author, 2016; Lewis, 2017). The United 
States established the Social and Behavioral Sciences Team (SBST) in 2014, and there is 
a parallel unit in the United Kingdom, the Behavioural Insights Team (BIT). These units 
have launched numerous projects based on behavioral science. For example, the SBST 
has projects aimed at student-loan default prevention, saving more for retirement, and 
inhibiting federal employees from texting while driving. The BIT has projects focused 
on facilitating taxpayer compliance and on timely vehicle registration behavior.


The “College Sophomore” Problem 


The concerns of many people who question the “representativeness” of psychologi- 
cal findings focus on the subjects of the research rather than on the intricacies of the 
experimental design. We are confronting here what is sometimes called the college 
sophomore problem; that is, the worry that, because college sophomores are the subjects 
in an extremely large number of psychological investigations, the generality of the 
results is in question. Psychologists are concerned about the college sophomore issue 
because it is a real problem in certain areas of research. Nevertheless, it is important to
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consider the problem in perspective and to understand that psychologists have several 
legitimate responses to this criticism. Here are three responses: 


1. The college sophomore criticism does not invalidate past results, but simply calls 
for more findings that will allow assessment of the theory’s generality. Adjust- 
ments in theory necessitated by contrary data from other groups can be made 
accurately only because we have the college sophomore data. The worst case, 
a failure to replicate, will mean that theories developed on the basis of college 
sophomore data are not necessarily wrong but merely incomplete. 


2. In many areas of psychology, the college sophomore issue is simply not a prob- 
lem because the processes investigated are so basic (the visual system, for exam- 
ple) that virtually no one would worry that their fundamental organization 
depends on the demographics of the subject sample. The functional organiza- 
tion of the brain and the nature of the visual systems of people in Montana tend 
to be very similar to those of people in Florida (or Argentina, for that matter). 


3. Replication of findings ensures a large degree of geographic generality and, to a 
lesser extent, generality across socioeconomic factors, family variables, and early 
educational experience. As opposed to studies conducted 75 years ago, when the 
sample of university subjects participating would have come from an extremely 
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It would be remiss, however, not to admit that the college sophomore issue is a 
real problem in certain areas of research in psychology. Nevertheless, psychologists 
are now making greater efforts to correct the problem. For example, developmental 
psychologists are almost inherently concerned about this issue. Each year hundreds of 
researchers in this area test dozens of findings and theories that were developed from 
studies of college subjects by performing the same research on subjects of different 
ages. The results from subject groups of different ages do not always replicate those 
from college students. Developmental psychology would be starkly boring if they did. 
But this sizable group of psychologists is busy building an age component into psy- 
chological theories, demonstrating the importance of this factor, and ensuring that the 
discipline will not end up with a large theoretical superstructure founded on a thin 
database derived from college students. 

Psychologists also conduct cross-cultural research in order to assess the general- 
ity of the processes uncovered by researchers working only with North American 
subgroups. There are many instances in which cross-cultural comparisons have 
shown similar trends across cultures (e.g., Demetriou et al., 2005), but there are 
others in which cross-cultural research does not replicate the trends displayed by 
American college sophomores (e.g., Buchtel & Norenzayan, 2009; Henrich et al., 
2010). However, when these discrepancies occur, they provide important information 
about the contextual dependence of theories and outcomes (Buchtel & Norenzayan, 
2009; Henrich et al., 2010). 
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of replicability. Many of the fundamental laws of information processing have been 
observed in dozens of laboratories all over the world. It is often not realized that if a
psychologist at the University of Michigan obtains a finding of true importance, [...]
State, Cambridge, Yale, Toronto, and elsewhere. Through this testing, we will soon
know whether the finding is due to the peculiarities of the Michigan subjects or the 
study’s experimental setting. 

Cognitive, social, and clinical psychologists have also studied various human 
decision making strategies. Most of the original studies in this research area were 
done in laboratories, used college students as subjects, and employed extremely 
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artificial tasks. However, the principles of decision making behavior derived from 
these studies have been observed in a variety of nonlaboratory situations, including 
the prediction of closing stock prices by bankers, actual casino betting, prediction 
of patient behavior by psychiatrists, economic markets, military intelligence analysis, 
betting on NFL football games, estimation of repair time by engineers, estimation of 
house prices by realtors, business decision making, and diagnoses by physicians—and 
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The internet also provides a way for psychology to deal with the college sopho- 
more problem (Germine et al., 2012; Maniaci & Rogge, 2014). Birnbaum (1999, 2004) 
ran a series of decision making experiments in the laboratory and by recruiting 
participants over the internet. The laboratory findings all replicated on the internet 
sample even though the latter was vastly more diverse—including 1,224 participants 
from 44 different countries. Gosling et al. (2004) studied a large internet sample of 
participants (361,703 people) and compared their performance with that of traditional 
samples in published studies. They found that the internet sample was more diverse 
with respect to gender, socioeconomic status, geographic region, and age. Importantly, 
they found that findings in many areas of psychology, such as personality theory, were 
similar on the internet when compared to traditional methods. 

The Amazon Mechanical Turk (called MTurk) has been used extensively in recent 
psychological research to test subject samples that are at least somewhat different from 
college sophomores (DeSoto, 2016; Paolacci & Chandler, 2014; Stewart et al., 2015). The 
MTurk is an online marketplace of workers who are willing to complete experimen- 
tal tasks for modest pay. The MTurk workers are considerably older than the college 
students used in most research (averaging over 30 years old), but they are atypical in 
other ways (they are less religious, underemployed, etc.). Nonetheless, many experi- 
mental effects that have been found in the lab are being tested on MTurk samples 
and are replicating with moderate frequency. Other internet sites such as Facebook 
are being used increasingly for psychological research (Kholodkov, 2013; Kosinski et 
al., 2015). These other sites also offer types of subjects very different from the typical 
college sophomore. 

Of course, not all psychological findings replicate. On the contrary, replication 
failures do happen (Gilbert et al., 2016; Open Science Collaboration, 2015). The rate of 
replication failure in psychology has been an issue of intense discussion and debate 
during the last few years (Maxwell et al., 2015), as has the question of whether the rate
in psychology is higher than that of other disciplines. This is a difficult question to
answer, but it seems that failures of replication are less likely to be reported in
psychology than in the physical sciences (Fanelli, 2010), indicating that psychology still has a
way to go in upgrading its standards. Nevertheless, failures to replicate seem as preva- 
lent in biology and medicine as they are in psychology (Author, 2013). The number 
of meta-analyses (discussed in Chapter 8) in psychology is increasing, a sign that the 
field is concerned with the consistency of its findings (Schmidt & Oh, 2016). 

Nevertheless, there is encouraging evidence that a substantial number of 
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tings (although not all do). In the most comprehensive analysis to date, Mitchell (2012) 
meta-analyzed (see Chapter 8) data on 217 lab-field comparisons from various areas of 
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findings observed in the lab and those found in the field, but there was a large variation 
between different areas of psychology. Industrial-organizational psychology showed 
the highest degree of correspondence between lab and field, but social psychology was 
much lower. In 187 of the 217 comparisons, the lab results and field results were in
the same direction, but in the remaining 30 cases the results in the lab were the opposite of
those in the field. Among these 30 reversals, the majority came from social psychology.
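
As a quick check on the counts just reported, the short Python sketch below converts them to proportions. Only the three counts (217 comparisons, 187 in the same direction, 30 reversals) come from the text; the variable names are ours and are purely illustrative.

```python
# Counts reported in the text for Mitchell's (2012) lab-field comparisons.
total_comparisons = 217
same_direction = 187
reversals = 30

# The two categories should account for every comparison.
assert same_direction + reversals == total_comparisons

# Convert each count to a share of all comparisons.
share_same = same_direction / total_comparisons
share_reversed = reversals / total_comparisons

print(f"Same direction: {share_same:.1%}")
print(f"Reversed:       {share_reversed:.1%}")
```

In other words, roughly six out of every seven comparisons agreed in direction, which is why the overall picture is encouraging even though reversals are far from negligible.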




But how can any psychological findings be applied if replication failures some- 
times occur? How can applications be justified if knowledge and theories are not 
established with certainty, when there is not complete agreement among scientists on 
all the details? This particular worry about the application of psychological findings is 
common because people do not realize that findings and theories in other sciences are 
regularly applied before they are firmly established. Of course, Chapter 2 should have
[...] before we can apply the results of scientific investigations,
then no applications would ever take place. Applied scientists in all fields do their 
best to use the most accurate information available, realizing at the same time that the 
information is fallible. 

Many nonscientists view medicine as much more scientific than psychology. But 
medicine has taken as long as psychology has to move from clinical impression to 
science-based practice (Lewis, 2017; Novella, 2015). Also, the uncertainty in the prac- 
tice of medicine is no less than that in the practice of psychology. For example, key 
treatment-related findings in medicine often fail to replicate, diagnosis is often more a 
function of the physician than the disease, and new technologies often result in over- 
treatment that does not increase cure rates (Welch et al., 2012). Medical researchers still 
debate the benefits and harms of mammography at various ages (Kolata, 2014). The 
benefits and costs of a daily baby aspirin to prevent cardiovascular disease are still 
contested (Cuzick, 2015; Marks, 2015). Knowledge in psychology is probabilistic and 
uncertain—but the same is true in most other biosocial sciences. 


The Real-Life and College Sophomore 
Problems in Perspective 


Several issues have been raised in this chapter, and it is important to be clear about 
what has, and what has not, been said. We have illustrated that the frequent complaint 
about the artificiality of psychological research arises from a basic misunderstanding 
not only of psychology but also of basic principles that govern all sciences. Artificial 
conditions are not a drawback of experimental research. They are deliberately created 
so that we can pry variables apart. 

We have also seen why people are concerned that psychologists do not use ran- 
dom samples in all their research and also why this worry is often unfounded. Finally, 
we have seen that a legitimate concern, the college sophomore problem, is sometimes 
overstated, particularly by those who are unfamiliar with the full range of activities 
and the diverse types of research that go on in psychology. The college sophomore 
problem has been an issue of great concern within psychology, and no psychologist 
is unaware of it. So, although we should not ignore the issue, we must also keep it in 
perspective. 

Nevertheless, psychologists should always be concerned that their experimental 
conclusions not rely too heavily on any one method or particular subject population. 


plagued by a college sophomore problem (Jaffe, 2005; Henrich et al., 2010). Cross-cultural
psychology, an antidote to the college sophomore problem, is not yet fully
integrated with psychology as a whole (Wang, 2017). As mentioned previously, the 
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our data analysis procedures (Simmons et al., 2011). Finally, as will be discussed in 
Chapter 12, there is a growing problem in psychology that too many of its research- 
ers (particularly in universities) share preexisting biases, especially political ones 
(Duarte et al., 2015; Inbar & Lammers, 2012; Jussim et al., 2016; Lukianoff & Haidt,
2015; Tetlock, 2012).


Summary 


Some psychological research is applied work in which 
the goal is to relate the results of the study directly to a 
particular situation. In such applied research, in which 
the results are intended to be extrapolated directly to a 
naturalistic situation, questions of the randomness of the 
sample and the representativeness of the conditions are 
important because the findings of the study are going to 
be applied directly. However, most psychological research 
is not of this type. It is basic research designed to test 
theories of the underlying mechanisms that influence 
behavior. In most basic research, the findings are applied 
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only indirectly through modifications in a theory that will 
at some later point be applied to some practical problem. 
In basic research of this type, random sampling of subjects 
and representative situations are not an issue because the 
emphasis is on testing the universal prediction of a theory. 
In fact, artificial situations are deliberately constructed in 
theory-testing basic research because (as described in the 
previous chapter) they help to isolate the critical variable 
for study and to control extraneous variables. Thus, the 
fact that psychology experiments are “not like real life” is 
a strength rather than a weakness. 


Chapter 8 

Avoiding the 
Einstein Syndrome: 
The Importance of 
Converging Evidence 


Learning Objectives 


8.1 Compare the breakthrough and gradual-synthesis models of 
scientific progress 


8.2 Describe the principle of converging evidence in evaluating 
experiments and testing theories 


8.3 Explain how multiple research methods are used to arrive at a 
scientific consensus 


8.4 Explain why meta-analysis is used to draw conclusions in psychology


“Biological Experiment Reveals the Key to Life,” “New Breakthrough in Mind Control,” 
“California Scientist Discovers How to Postpone Death”—as you can see, it is not dif- 
ficult to parody the “breakthrough” headlines of the media (including print media, 
television, and the internet). Because such headlines regularly come from the most irre- 
sponsible quarters of the media, it should not be surprising that most scientists recom- 
mend that they be approached with skepticism. The purpose of this chapter, though, is 
not only to warn against the spread of misinformation via exaggeration or to caution 
that the source must be considered when evaluating reports of scientific advances. In 
this chapter, we also want to develop a more complex view of the scientific process than 
was presented in earlier chapters. We shall do this by elaborating on the ideas of sys- 
tematic empiricism and public knowledge that were introduced in Chapter 1. 

The breakthrough headlines in the media obscure an understanding of psychology
and other sciences in many ways. One particular misunderstanding that arises from
breakthrough headlines is the implication that all problems in science are solved when 
a single, crucial experiment completely decides the issue, or that theoretical advance is 
the result of a single critical insight that overturns all previous knowledge. Such a view 
of scientific progress fits in nicely with the operation of the news media and the internet, 
in which history is tracked by presenting separate, disconnected events in bite-sized 
units. It is also a convenient format for the Hollywood entertainment industry, where 
events must have beginnings and satisfying endings that resolve ambiguity. However, 
this is a gross caricature of scientific progress and, if taken too seriously, leads to
misconceptions about scientific advancement and impairs the ability to evaluate the extent
of scientific knowledge on a given issue. In this chapter, we will discuss two principles
of science—the connectivity principle and the principle of converging evidence—that 
describe scientific progress much more accurately than the breakthrough model. 


The Connectivity Principle 


In denying the validity of the “great-leap” or crucial-experiment model of all scien- 
tific progress, we do not wish to argue that such critical experiments and theoreti- 
cal advances never occur. On the contrary, some of the most famous examples in the 
history of science represent just such occurrences. The development of the theory of 
relativity by Albert Einstein is by far the most well known. Here, a reconceptualization 
of such fundamental concepts as space, time, and matter was achieved by a series of 
remarkable theoretical insights. 

However, the monumental nature of Einstein’s achievement has made it the domi- 
nant model of scientific progress in the public’s mind. This dominance is perpetuated 
because it fits in nicely with the implicit “script” that the media use to report most 
news events. More nonsense has been written about relativity theory than perhaps any 
other idea in all of history (no, Einstein did not prove that “everything is relative”). Of 
course, our purpose is not to deal with all of these fallacies here. There is one, however, 
that will throw light on our later discussions of theory evaluation in psychology.

The reconceptualization of ideas about the physical universe contained in 
Einstein’s theories is so fundamental that popular writing often treats it as if it were 
similar to conceptual changes in the arts (a minor poet is reevaluated and emerges 
with the status of a genius; an artistic school is declared dead). Such presentations 
ignore a basic difference between conceptual change in the arts and in the sciences. 

Conceptual change in science obeys a principle of connectivity that is absent or, 
at least, severely limited in the arts (Bronowski, 1977; Haack, 2007). That is, a new 
theory in science must make contact with previously established empirical facts. To 
be considered an advance, it must not only explain new facts but also account for old 
ones. The theory may explain old facts in a way quite different from that of a previous 
theory, but explain them it must. This requirement ensures the cumulative progress of 
science. Genuine progress does not occur unless the realm of our explanatory power 
has been widened. If a new theory accounts for some new facts but fails to account for
a host of old ones, it will not be considered an advance over the old theories and, thus, 
will not immediately replace them. 

Despite the startling reconceptualizations in Einstein’s theories (clocks in motion 
running slower, mass increasing with velocity, etc.), they did maintain the principle of 
connectivity. In rendering Newtonian mechanics obsolete, Einstein’s theories did not 
negate or render meaningless the facts about motion on which Newton’s ideas were 
based. On the contrary, at low velocities the two theories make essentially the same pre- 
dictions. Einstein’s conceptualization is superior because it accounts for a wide variety 
of new, sometimes surprising, phenomena that Newtonian mechanics cannot accommo- 
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reconceptualizations in the history of science, maintain the principle of connectivity. 

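The claim that the two theories agree at low velocities can be checked numerically. The sketch below is ours, not part of the original argument: it uses the standard Lorentz factor from special relativity, gamma = 1 / sqrt(1 - v^2 / c^2), which scales the clock-slowing effect mentioned above. Newtonian mechanics in effect assumes gamma is exactly 1. The function name and the sample speeds are illustrative assumptions.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second (exact, by definition)


def lorentz_factor(velocity: float) -> float:
    """Return gamma = 1 / sqrt(1 - v^2 / c^2) for a speed given in m/s."""
    beta = velocity / SPEED_OF_LIGHT
    return 1.0 / math.sqrt(1.0 - beta * beta)


# Illustrative speeds (our choice, not taken from the text).
speeds = {
    "highway car (30 m/s)": 30.0,
    "jet airliner (250 m/s)": 250.0,
    "half the speed of light": 0.5 * SPEED_OF_LIGHT,
    "90% of the speed of light": 0.9 * SPEED_OF_LIGHT,
}

# gamma - 1 is the size of the correction that Newtonian mechanics omits.
for label, v in speeds.items():
    gamma = lorentz_factor(v)
    print(f"{label:28s} gamma - 1 = {gamma - 1.0:.3e}")
```

At everyday speeds the correction is far too small to measure with ordinary instruments, so the old facts about motion remain accounted for. Only at a sizable fraction of the speed of light do the two theories visibly part ways.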

A Consumer’s Rule: Beware of Violations of Connectivity


The breakthrough model of scientific progress—what we might call the Einstein 
syndrome—leads us astray by implying that new discoveries violate the principle of con- 
nectivity. This implication is dangerous because, when the principle of connectivity is 
abandoned, the main beneficiaries are the purveyors of pseudoscience and bogus theories. 
Such theories derive part of their appeal and much of their publicity from the fact that 
they are said to be startlingly new. “After all, wasn’t relativity new in its day?” is usually 
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the tactic used to justify novelty as a virtue. Of course, the data previously accumulated 
in the field that the pseudoscientists wish to enter would seem to be a major obstacle. 
Actually, however, it presents only a minor inconvenience because two powerful strategies 
are available to dispose of it. One strategy that we have already discussed (see Chapter 2) 
is to explain the previous data by making the theory unfalsifiable and, hence, useless. 

The second strategy is to dismiss previous data by declaring them irrelevant. This 
dismissal is usually accomplished by emphasizing what a radical departure the new 
theory represents. The phrases “new conception of reality” and “radical new depar- 
ture” are frequently used. Vague references to quantum theory are often thrown in to 
suggest that the new theory is deep and profound (DeBakcsy, 2014; Hassani, 2016). The 
real sleight of hand, though, occurs in the next step of the process. The new theory is 
deemed so radical that experimental evidence derived from the testing of other theo- 
ries is declared irrelevant. Only data that can be conceptualized within the framework 
of the new theory are to be considered; that is, the principle of connectivity is explicitly 
broken. Obviously, because the theory is so new, such data are said to not yet exist. 
And there you have it: a rich environment for the growth of pseudoscience. The old, 
“irrelevant” data are gone, and the new, relevant data do not exist. The scam is easily 
perpetrated because the Einstein syndrome obscures the principle of connectivity, the 
importance of which is ironically illustrated by Einstein’s theories themselves. 
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not connect with what else is known in behavioral science. Recall the discussion of 
facilitated communication in Chapter 6. It breaks the principle of connectivity because 
it would require that we overturn basic knowledge in fields as diverse as neurology, 
genetics, and cognitive psychology. This hypothesized cure shows no connectivity 
with the rest of science. 

Consider another example from psychology. Imagine two specific treatments have 
been developed to remediate the problems of children with extreme reading difficul- 
ties. No direct empirical tests of efficacy have been carried out using either treatment. 
The first, Treatment A, is a training program to facilitate the awareness of the segmen- 
tal nature of language at the phonological level. The second, Treatment B, involves 
giving children training in vestibular sensitivity by having them walk on balance 
beams while blindfolded. Even if there was no prior evidence on either treatment, one 
of them has the edge when it comes to the principle of connectivity. Treatment A makes 
contact with a broad consensus in the research literature that children with reading 
difficulties are hampered because of insufficiently developed awareness of the seg- 
mental structure of language (Hulme & Snowling, 2013; Seidenberg, 2017). Treatment 
B is not connected to any corresponding research literature consensus. This difference
in connectivity dictates that Treatment A is a better choice. 

Neurologist Steven Novella (2015) makes the same point about complementary 
and alternative medicine. Saying that complementary and alternative medicine lacks 
empirical evidence—which it does (Dorlo et al., 2015; Mielczarek & Engler, 2013; Swan
et al., 2015)—is in one sense too generous. Novella (2015) points out that most of
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these remedies do not deserve to have experimental tests conducted on them—because 
they display no connectivity with the rest of science. 


The “Great-Leap” Model Versus 
the Gradual-Synthesis Model 


The tendency to view the Einsteinian revolution as typical of what science is tempts 
us to think that all scientific advances occur in giant leaps. The problem is that people 
tend to generalize such examples into a view of the way all scientific progress should 
take place. In fact, many areas in science have advanced not by single, sudden
breakthroughs but by a series of fits and starts that are less easy to characterize.

There is a degree of fuzziness in the scientific endeavor that most of the public
is unaware of. Experiments rarely completely decide a given issue, supporting one 
theory and ruling out all others. New theories are rarely clearly superior to all previ- 
ously existing competing conceptualizations. Issues are most often decided not by a 
critical experiment, as movies about science imply, but when the community of scien- 
tists gradually begins to agree that the preponderance of evidence supports one theory 
rather than another. The evidence that scientists evaluate is not the data from a single
experiment that has finally been designed in the perfect way. Instead, scientists most 
often must evaluate data from literally dozens of experiments, each containing some 
flaws but each providing a small part of the answer. This alternative model of scientific 
progress has been obscured because the Einstein syndrome creates in the public a ten- 
dency to think of all science by reference to physics, to which the great-leap model of 
scientific progress is perhaps most applicable. 

Consider the rapid advances in genetics and molecular biology that have 
occurred in the last hundred years. These advances have occurred not because one 
giant, Einstein, came onto the scene at the key moment to set everything straight. 
Instead, dozens of different insights based on hundreds of experiments have con- 
tributed to the modern synthesis in biology. These advances occurred not by the 
instantaneous recognition of a major conceptual innovation, but by long, drawn-out 
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lation, argument, and criticism, for scientists to change their view about whether 
genes were made of protein or nucleic acid. The consensus of opinion changed, but 
not in one great leap. 

Science is a cumulative endeavor that respects the principle of connectivity. It 
is characterized by the participation of many individuals, whose contributions are 
judged by the extent to which they further our understanding of nature. No single 
individual can dominate discourse simply by virtue of his or her status. Science rejects 
claims of “special knowledge” available to only a few select individuals. This rejection, 
of course, follows from our discussion of the public nature of science in Chapter 1. By 
contrast, pseudosciences often claim that certain authorities or investigators have a 
“special” access to the truth. 

We have presented two ideas here that provide a useful context for understanding 
the discipline of psychology. First, no experiment in science is perfectly designed. There 
is a degree of ambiguity in the interpretation of the data from any one experiment. 
Scientists often evaluate theories not by waiting for the ideal or crucial experiment to 
appear, but by assessing the overall trends in a large number of experiments—each 
with different limitations. Second, many sciences have progressed even though they 
are without an Einstein. Their progress has occurred by fits and starts, rather than 
by discrete stages of grand Einsteinian syntheses. Also like psychology, many other 
sciences are characterized instead by growing mosaics of knowledge that lack a single 
integrating theme. 


Converging Evidence: Progress Despite Flaws


The preceding discussion leads us to a principle of evidence evaluation of much
importance in psychology. This idea is sometimes called the principle of converging evidence
(or converging operations). Scientists and those who apply scientific knowledge must
often make a judgment about where the preponderance of evidence points. When this
is the case, the principle of converging evidence is an important tool. We will explore
two ways of expressing the principle, one in terms of experiments with limitations and
the other in terms of theory testing.

There are always a number of ways in which an experiment can go wrong (or
become confounded, to use the technical term). However, a scientist with much experi- 
ence in working on a particular problem usually has a good idea of what the most 
likely confounding factors are. Thus, when surveying the research evidence, scien- 
tists are usually aware of the critical flaws in each experiment. The idea of converging 
evidence, then, tells us to examine the pattern of flaws running through the research 
literature because the nature of this pattern can either support or undermine the con- 
clusions that we wish to draw. 

For example, suppose the findings from a number of different experiments were 
largely consistent in supporting a particular conclusion. Given the imperfect nature of 
experiments, we would go on to evaluate the extent and nature of the limitations in 
these studies. If all the experiments were limited in a similar way, this circumstance 
would undermine confidence in the conclusions drawn from them because the con- 
sistency of the outcome may simply have resulted from a particular flaw that all the 
experiments shared. On the other hand, if all the experiments were limited in different 
ways, our confidence in the conclusions would be increased because it is less likely 
that the consistency in the results was due to a contaminating factor that confounded 
all the experiments. 
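
The check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study names and limitation labels are hypothetical, and the function `shared_limitations` is our own helper, not something from the research literature. It reports which limitations, if any, are common to every study in a collection.

```python
from typing import Dict, Set


def shared_limitations(studies: Dict[str, Set[str]]) -> Set[str]:
    """Return the limitations that every study in the collection has in common."""
    limitation_sets = list(studies.values())
    if not limitation_sets:
        return set()
    return set.intersection(*limitation_sets)


# Hypothetical literature 1: every study shares the same flaw.
same_flaw = {
    "study_1": {"self-report measure", "small sample"},
    "study_2": {"self-report measure"},
    "study_3": {"self-report measure", "single site"},
}

# Hypothetical literature 2: each study is limited in a different way.
different_flaws = {
    "study_1": {"self-report measure"},
    "study_2": {"small sample"},
    "study_3": {"single site"},
}

common_in_first = shared_limitations(same_flaw)
common_in_second = shared_limitations(different_flaws)
print(common_in_first)   # a shared flaw could explain the consistent results
print(common_in_second)  # no single flaw runs through all the studies
```

A non-empty result is a warning sign that the consistency of the findings may stem from the shared flaw. An empty result is the more reassuring pattern, although it does not by itself show that the conclusion is correct.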

Each experiment helps to correct errors in the design of other experiments, and 
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no one experiment was perfectly designed. Thus, the principle of converging evidence 
urges us to base conclusions on data that arise from a number of slightly different 
experimental sources. The principle allows us to draw stronger conclusions because 
consistency that has been demonstrated in such a context is less likely to have arisen 
from the peculiarities of a single type of experimental procedure. 

The principle of converging evidence can also be stated in terms of theory testing. 
Research is highly convergent when a series of experiments consistently supports a 
given theory while collectively eliminating the most important competing theory. 
Although no single experiment can rule out all alternative explanations, taken collec- 
tively a series of partially diagnostic experiments can lead, if the data patterns line up 
in a certain way, to a strong conclusion. 

For example, suppose that five different theoretical accounts (call them A, B, C, D,
and E) of a given set of phenomena exist at one time and are investigated in a series 
of experiments. Suppose that one experiment represents a strong test of theories A, B, 
and C, and that the data largely refute theories A and B and support C. Imagine also 
that another experiment is a particularly strong test of theories C, D, and E, and that 
the data largely refute theories D and E and support C. In such a situation, we would 
have strong converging evidence for theory C. Not only do we have data supportive 
of theory C, but we have data that contradict its major competitors. Note that no 
one experiment tests all the theories, but taken together, the entire set of experiments 
allows a strong inference. The situation might be depicted like the following: 


               Theory A    Theory B    Theory C    Theory D    Theory E
Experiment 1   refuted     refuted     supported   untested    untested
Experiment 2   untested    untested    supported   refuted     refuted
Conclusion     refuted     refuted     supported   refuted     refuted


By contrast, if both experiments represented strong tests of B, C, and E, and 
the data of both experiments strongly supported C and refuted B and E, the overall 
support for theory C would be less strong than in our previous example. The reason
is that, although data supporting theory C have been generated, there is no strong
evidence ruling out two viable alternative theories (A and D). The situation would be 
something like the following: 


               Theory A    Theory B    Theory C    Theory D    Theory E
Experiment 1   untested    refuted     supported   untested    refuted
Experiment 2   untested    refuted     supported   untested    refuted
Conclusion     untested    refuted     supported   untested    refuted


Thus, research is highly convergent when a series of experiments consistently 
supports a given theory while collectively eliminating the most important competing 
explanations. Although no single experiment can rule out all alternative explanations, 
taken collectively a series of partially diagnostic experiments can lead to a strong con- 
clusion if the data converge in the manner of our first example. 
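
The two tables can be reproduced with a short Python sketch. The combining rule here is our own simplifying assumption, not a rule stated in the text: a theory counts as refuted if any experiment refuted it, as supported if it was supported and never refuted, and as untested otherwise. The function names `combine` and `is_highly_convergent` are likewise hypothetical.

```python
from typing import Dict, List

REFUTED, SUPPORTED, UNTESTED = "refuted", "supported", "untested"


def combine(experiments: List[Dict[str, str]]) -> Dict[str, str]:
    """Merge per-experiment outcomes into one conclusion per theory.

    Assumed rule (a simplification): any refutation wins, then any support;
    a theory that no experiment tested stays "untested".
    """
    conclusion: Dict[str, str] = {}
    for outcomes in experiments:
        for theory, result in outcomes.items():
            previous = conclusion.get(theory, UNTESTED)
            if REFUTED in (previous, result):
                conclusion[theory] = REFUTED
            elif SUPPORTED in (previous, result):
                conclusion[theory] = SUPPORTED
            else:
                conclusion[theory] = UNTESTED
    return conclusion


def is_highly_convergent(conclusion: Dict[str, str]) -> bool:
    """True when exactly one theory is supported and every rival is refuted."""
    supported = [t for t, r in conclusion.items() if r == SUPPORTED]
    rivals_refuted = all(
        r == REFUTED for t, r in conclusion.items() if t not in supported
    )
    return len(supported) == 1 and rivals_refuted


# First example: the two experiments test different sets of rivals.
first_example = [
    {"A": REFUTED, "B": REFUTED, "C": SUPPORTED, "D": UNTESTED, "E": UNTESTED},
    {"A": UNTESTED, "B": UNTESTED, "C": SUPPORTED, "D": REFUTED, "E": REFUTED},
]

# Second example: both experiments test the same rivals (B and E).
second_example = [
    {"A": UNTESTED, "B": REFUTED, "C": SUPPORTED, "D": UNTESTED, "E": REFUTED},
    {"A": UNTESTED, "B": REFUTED, "C": SUPPORTED, "D": UNTESTED, "E": REFUTED},
]

first_conclusion = combine(first_example)
second_conclusion = combine(second_example)
print(first_conclusion, is_highly_convergent(first_conclusion))
print(second_conclusion, is_highly_convergent(second_conclusion))
```

Under these assumptions only the first set of experiments counts as highly convergent, because theories A and D remain untested in the second. Treating one refuting experiment as decisive is a convenience for the sketch; in practice both confirmation and disconfirmation rest on many studies.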

Finally, the introduction of the idea of converging evidence allows us to dispel a 
misconception that may have been fostered by our oversimplified discussion of falsifi- 
ability in Chapter 2. That discussion may have seemed to imply that a theory is falsi- 
fied when the first piece of evidence that disconfirms it comes along. This is not the 
case, however. Just as theories are confirmed by converging evidence, they are also 
disconfirmed by converging results. 

Knowledge of the principle of converging evidence is what leads retired physi- 
cian Harriet Hall (2013) to warn us to be skeptical of the phrase “new study shows” 
when we see it in the media or on the internet. You know the type of thing: New 
Study Shows that People Who Eat Kumquats Live 40 percent Longer. The reason to 
be skeptical should be clear to you by now: no single study shows anything! Before
we derive a conclusion, many studies must be amalgamated and we must
assess whether they converge. Renowned cognitive psychologist Steven Pinker echoes
this point: “There’s a habit among science journalists to treat a single experiment as
something that is newsworthy. But a single study proves very little. Readers have been
led to expect shocking discoveries from a discipline that depends on slow, stutter-step
increments, and the media goes in leaps and bounds” (Miller, 2016, p. 19).


Types of Converging Evidence 


The reason for stressing the importance of convergence is that conclusions in
psychology are often based on the principle of converging evidence. There is certainly
nothing unique or unusual about this fact (conclusions in many other sciences rest not
on single, definitive experimental proofs, but on the confluence of dozens of fuzzy
experiments). But there are reasons that this might be especially true of psychology.
Experiments in psychology are usually of fairly low diagnosticity. That is, the data that
support a given theory usually rule out only a small set of alternative explanations,
leaving many additional theories as viable candidates. As a result, strong conclusions
are usually possible only after data from a very large number of studies have been
collected and compared.

Better public understanding will come about if psychologists openly acknowledge
this fact and then take pains to explain just what follows from it. Psychologists should
admit that, although a science of psychology exists and is progressing, progress is
slow, and our conclusions come only after a sometimes excruciatingly long period of
research amalgamation and debate.

Media claims of breakthroughs (whether in print, on television, or on the internet)
should always engender skepticism, but this is especially true of psychological claims. 
For example, it sometimes seems like the media announces a new cure for autism 
about every three months. But these claims have been continuously occurring for over 
20 years now. Why are we still announcing a cure for autism when one was announced 
20 years ago?...and 19 years ago?...and 18 years ago?...etc. This of course suggests 
that the announcement 20 years ago was not a true cure at all. Perhaps it was a bogus 
claim. More likely, it was just a small step in the long road of scientific progress that will 
lead to a convergence of evidence regarding this condition. But these premature media 
reports wrongly imply that research on autism is noncumulative—that researchers are 
not slowly building knowledge, but instead are searching for a magic bullet. 

It is likewise with a research specialty area that I worked in during the early part of 
my career—the psychology of reading and reading disability. As with autism, a “cure” 
(magic bullet) for dyslexia has been announced in the media nearly yearly since about 
1990! For some examples, I haphazardly perused a bulging clip file of articles I have 
collected on these premature announcements and came up with the November 22, 
1999, issue of Newsweek magazine featuring on its cover a lead article titled: Dyslexia: 
New Hope for Kids Who Can’t Read (Kantrowitz & Underwood, 1999). And here is 
an article in the February 26, 2001, National Post (Canada) newspaper with an article 
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Brain Science Reveals (Gorman et al., 2003). And finally, fairly close to the present, 
here comes Newsweek again, on March 31, 2016, with an article titled Electric Shocks 
Help Dyslexic Children Read Faster (Cuthbertson, 2016). I’ll stop there. There was no
magic bullet in any of these articles. My point is not that the research reported in these 
articles was bad or wrong. The important point to understand though is that the media 
sources exaggerated the “magic bullet” nature of the studies reported. They were not 
“cures,” but instead were part of the slow progress that is definitely being made in the 
area of reading disability (Seidenberg, 2017). 

The media does the same thing with attention deficit hyperactivity disorder 
(ADHD)—announces startling new discoveries (magic bullets) vastly prematurely. 
This tendency to prematurely report breakthroughs in the media has been studied 
in the ADHD area. A group of researchers studied the ten most publicized scientific 
articles on ADHD over a ten-year period in the 1990s (Gonon et al., 2012). These ten
articles resulted in 347 newspaper reports (typical title: “Hyperactivity Linked to 
Genetic Defect”). The researchers then looked at the next decade of research to see 
if the ten findings replicated. What they found confirms our fears about premature 
reports in the media. Only two of the ten were strongly replicated. Six failed to repli- 
cate completely. Two others showed attenuated findings (findings not as strong as in 
the original report). In short, these studies did not deserve to be publicized as “break- 
throughs” or magic bullets. They were instead just small, confusing (and sometimes 
wrong) steps toward an eventual understanding of ADHD. Indeed, the premature
hyping of studies such as these has, ironically, been termed “Journalistic Deficit
Disorder” (Reporting Science, 2012).

In psychology we have to walk a very fine line. For example, we must resist the 
temptation to regard a particular psychological hypothesis as “proven” when the evi- 
dence surrounding it is still ambiguous. This skeptical attitude has been reinforced in 
several chapters of this book. The cautions against inferring causation from correla- 
tion and against accepting testimonial evidence have served as examples. At the same 
time, we should not overreact to the incompleteness of knowledge and the tentative- 
ness of conclusions by doubting whether firm conclusions in psychology will ever be 
reached. Nor should we be tempted by the irrational claim that psychology cannot be 
a science. From this standpoint, the principle of converging evidence can be viewed 
as a counterweight to the warnings against overinterpreting tentative knowledge.
Convergence allows us to reach many reasonably strong conclusions despite the flaws
in all psychological research. 

The best way to see the power of the principle of converging evidence is to exam- 
ine some areas in psychology where conclusions have been reached by the conver- 
gence of evidence. Let’s consider an example. A research problem that illustrates the 
importance of the principle of converging evidence is the question of whether exposure 
to violent television programming increases children’s tendencies toward aggressive 
behavior. There is now a scientific consensus on this issue: The viewing of violent pro- 
gramming (on television, in movies, or in streaming video) does appear to increase the 
probability that children will engage in aggressive behavior. The effect is not extremely 
large, but it is real. Again, the confidence that scientists have in this conclusion derives 
not from a single definitive study, but from the convergence of the results of dozens 
of different investigations (Bushman et al., 2016; Carnagey et al., 2007; Feshbach & 
Tangney, 2008; Fischer et al., 2011b). This research conclusion holds for violent video 
games as well as television and movies (Calvert et al., 2017; Carnagey et al., 2007), but 
the effect there seems to be fairly small as well (Ferguson, 2013; Furuya-Kanamori & 
Doi, 2016). The general research designs, subject populations, and specific techniques 
used in these investigations differed widely, and as should now be clear, these differ- 
ences are a strength of the research in this area, not a weakness. 
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on a campaign of misinformation that capitalizes on the public’s failure to realize that 
research conclusions are based on the convergence of many studies rather than on 
a single critical demonstration that decides the issue (Seethaler, 2009). The television 
networks and video game makers continually single out individual studies for criti- 
cism and imply that the general conclusion is undermined by the fact that each study 
has demonstrated flaws. But it is not commonly recognized that researchers often can- 
didly admit the flaws in a given study. The critical difference is that researchers reject 
the implication that admitting a flaw in a given study undermines the general scien- 
tific consensus on the effects of televised violence on aggressive behavior. The reason 
is that the general conclusion derives from a convergence. Research without the spe- 
cific flaws of the study in question has produced results pointing in the same direction. 
This research may itself have problems, but other studies have corrected for these and 
have also produced similar results. 

For example, very early in the investigation of this issue, evidence of the correla- 
tion between the amount of violent programming viewed and aggressive behavior in 
children was uncovered. It was correctly pointed out that this correlational evidence 
did not justify a causal conclusion. Perhaps a third variable was responsible for the 
association, or perhaps more aggressive children chose to watch more violent pro- 
gramming (the directionality problem). 

But the conclusion of the scientific community is not based on this correlational 
evidence alone. There are more complex correlational techniques than the simple
measurement of the association between two variables, and these correlational techniques
allow some tentative conclusions about causality (one, that of partial correlation, was 
mentioned in Chapter 5). One of these techniques involves the use of a longitudinal 
design in which measurements of the same two variables—here, television violence 
and aggression—are taken at two different times. Certain correlational patterns sug- 
gest causal connections. Studies of this type have been conducted, and the pattern of 
results suggested that viewing violent programming did tend to increase the probabil- 
ity of engaging in aggressive behavior later in life. 
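
One pattern of this kind is often called a cross-lagged comparison (our label; the text does not name the technique). The sketch below assumes that reading and uses invented scores for five hypothetical children, so the numbers carry no empirical weight. It compares the correlation between earlier viewing and later aggression with the correlation between earlier aggression and later viewing.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Invented scores for five hypothetical children, measured at two times.
tv_time1 = [1, 2, 3, 4, 5]          # violent viewing, earlier
aggression_time1 = [3, 1, 4, 2, 5]  # aggression, earlier
tv_time2 = [2, 2, 3, 4, 4]          # violent viewing, later
aggression_time2 = [2, 3, 3, 5, 6]  # aggression, later

# Cross-lagged correlations: each pairs an earlier measure with a later one.
viewing_then_aggression = pearson_r(tv_time1, aggression_time2)
aggression_then_viewing = pearson_r(aggression_time1, tv_time2)
print(round(viewing_then_aggression, 2), round(aggression_then_viewing, 2))
```

In this made-up data set, earlier viewing predicts later aggression more strongly than earlier aggression predicts later viewing, which is the kind of asymmetry that would point tentatively toward viewing as the cause. Such a comparison cannot rule out third variables, which is one reason these techniques remain controversial.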

Again, it is not unreasonable to counter that these longitudinal correlational 
techniques are controversial, because they are. The important point is that the con- 
clusion of a causal connection between televised violence and aggressive behavior 
does not depend entirely on correlational evidence, either simple or complex, because
numerous laboratory studies have been conducted in which the amount of televised
violence was manipulated rather than merely assessed. In Chapter 6, we discussed 
how the manipulation of a variable, used in conjunction with other experimental con- 
trols such as random assignment, prevents the interpretation problems that surround 
most correlational studies. If two groups of children, experimentally equated on all 
other variables, show different levels of aggressive behavior, and if the only difference 
between the two is that one group viewed violent programming and one did not, then 
we are correct in inferring that the manipulated variable (televised violence—the inde- 
pendent variable) caused the changes in the outcome variable (aggressive behavior— 
the dependent variable). This result has occurred in the majority of studies. 

These studies have prompted some to raise the “it’s-not-real-life” argument dis- 
cussed in the previous chapter and to use the argument in the fallacious way discussed 
in that chapter. In any case, the results on the effects of television violence are not 
peculiar to a certain group of children because these results have been replicated in 
different regions of the United States and in several countries around the world. The 
specific laboratory setup and the specific programs used as stimuli have varied from 
investigation to investigation, yet the results have held up. 

Importantly, the same conclusions have been drawn from studies conducted in 
the field rather than in the laboratory. A design discussed in Chapter 6, known as 
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necessary link between experimental design and experimental setting. People some- 
times think that studies that manipulate variables are conducted only in laboratories 
and that correlational studies are conducted only in the field. This assumption is 
incorrect. Correlational studies are often conducted in laboratories, and variables 
are often manipulated in nonlaboratory settings. Although they sometimes require 
considerable ingenuity to design, field experiments (several of which were men- 
tioned in Chapter 6), in which variables are manipulated in nonlaboratory settings, 
are becoming more common in psychology. 

Of course, field experiments themselves have weaknesses, but many of these 
weaknesses are the strengths of other types of investigation. In summary, the evi- 
dence linking the viewing of televised violence to increased probabilities of aggressive 
behavior in children does not rest only on the outcome of one particular study or even 
on one generic type of study. 

The situation is analogous to the relationship between smoking and lung cancer. 
Smokers are 15 times more likely to die from lung cancer than nonsmokers (Gigerenzer 
et al., 2007). In the past, cigarette company executives often attempted to mislead the 
public by implying that the conclusion that smoking causes lung cancer rested on 
some specific study, which they would then go on to criticize (Offit, 2008). Instead, 
the conclusion is strongly supported by a wealth of converging evidence. The conver- 
gence of data from several different types of research is quite strong and will not be 
changed substantially by the criticism of one study. 


Actually it is appropriate to discuss here a medical problem like the causes of 
lung cancer. Most issues in medical diagnosis and treatment are decided by an amal- 
gamation of converging evidence from many different types of investigations. For 
example, medical science is confident of a conclusion when the results of epidemio- 
logical studies (field studies of humans in which disease incidence is correlated with 
many environmental and demographic factors), highly controlled laboratory studies 
using animals, and clinical trials with human patients all converge. When the results 
of all these types of investigation point to a similar conclusion, medical science feels 
assured of the conclusion, and physicians feel confident in basing their treatment on 
the evidence. 

However, each of the three different types of investigation has its drawbacks. 
Epidemiological studies are always correlational, and the possibility of spurious links
between variables is high. Laboratory studies can be highly controlled, but the subjects
are often animals rather than humans. Clinical trials in a hospital setting use human 
subjects in a real treatment context, but there are many problems of control because of 
placebo effects and the expectations of the medical treatment team that deals with the 
patients. Despite the problems in each type of investigation, medical researchers are 
justified in drawing strong conclusions when the data from all the different methods 
converge strongly, as in the case of smoking and lung cancer. Just such a convergence 
also justifies the conclusions that psychologists draw from the study of a behavioral 
problem like the effect of televised violence on aggressive behavior. 

Sometimes the principle of converging evidence is unknown to people. Other
times it seems to be consciously ignored in order to advance a political agenda or an
agenda of financial advancement. The cigarette company experts and senior
executives who tried to confuse the public’s understanding of the converging evidence
that smoking causes lung cancer were probably aware of the convergence principle
and wished to obscure it from the public.

An example similar to the smoking/lung cancer case is occurring right at the 
present time. There is a strong convergence in science indicating that talking on a cell 
phone while driving (as well as the distraction from electronic dashboard devices 
while driving) is extremely dangerous and an important cause of car crashes (even 
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in cognitive science. Yet cell phone companies and auto companies—like the ciga- 
rette companies before them—are attempting to obscure from the public the fact that 
the science surrounding this conclusion is highly convergent (Insurance Institute for 
Highway Safety, 2005; Kunar et al., 2008; Levy et al., 2006; McEvoy et al., 2005; Richtel, 
2014; Strayer et al., 2016; Strayer & Drews, 2007). The technology companies and 
automobile companies ignore the science even more when they try to get a competi- 
tive edge by installing more interactive electronic features in cars. Apple’s CarPlay 
technology and Google’s Android Auto are particularly troublesome developments 
(Chaker, 2016; White, 2014), given the science on driver distraction. The technology 
companies and automobile companies continue to ignore the science of driver risk. 
These technology-caused deaths are preventable with electronic fixes now available, 
but these modern companies continue to act like the cigarette companies did years 
ago in their reluctance to deal with known consumer risks (Leonhardt, 2017). 


Scientific Consensus 


The problem of assessing the impact of televised violence is typical of how data finally 
accumulate to answer questions in psychology. Particularly in areas of pressing social 
concern, it is wise to remember that the answers to these problems emerge only slowly, 
after the amalgamation of the results from many different experiments. To put things
in the form of a simple rule, when evaluating empirical evidence in the field of
psychology, think in terms of scientific consensus rather than breakthrough, in terms of
gradual synthesis rather than great leap. 

The failure to appreciate the “consensus rather than breakthrough” rule has impeded 
the public’s understanding of the evidence that human activity is a contributor to global 
warming (Cook, 2016; Powell, 2015). In fact, there is not much scientific controversy 
about this conclusion (in its broadest sense), because the conclusion does not rest on a 
single study. There were over 900 global climate-change papers published between 1993 
and 2003, and they overwhelmingly converged on the conclusion that human activity 
was involved in global warming (Oreskes & Conway, 2011). No single study was defini- 
tive in establishing the conclusion so, obviously, undermining a single study would not 
change the conclusion at all. Note, however, that establishing the conclusion in a broad
sense does not necessarily dictate what should be done in response to the conclusion.
What is to be done is a political judgment. The fact itself is in the realm of science. The
fact does not necessarily dictate a particular policy response, or any response at all.


Methods and the Convergence Principle 


[...] expect many different methods [...] Psychology
has long been criticized for relying too heavily on laboratory-based experimental
techniques. Nevertheless, an unmistakable trend in recent years has been to expand
the variety of methods used in all areas of psychology. Researchers have turned to
increasingly imaginative field designs in search of converging evidence to support
their theories.

As an example, consider the voluminous research done on what has been called
the [...], that is, [...] (Fischer
et al., 2011a; Thomas et al., 2016). The probability of helping can sometimes go down as
more potential helpers are present. The early investigators of this phenomenon were
well aware that their conclusions would be tenuous if they were based only on the 
responses of individuals who witnessed emergencies after reporting to a laboratory 
to participate in an experiment. Therefore, in an early famous study of this effect, 
researchers found a cooperative liquor store that agreed to have fake robberies occur 
in the store 96 different times. While the cashier was in the back of the store getting 
some beer for a “customer,” who was actually an accomplice of the experimenter, the 
“customer” walked out the front door with a case of beer. This was done in the view 
of either one or two real customers who were at the checkout counter. The cashier then 
came back and asked the customers, “Hey, what happened to that man who was in 
here? Did you see him leave?” thus giving the customers a chance to report the theft. 
Consistent with the laboratory results, the presence of another individual inhibited the 
tendency to report the theft. 

Many of the principles of probabilistic decision making to be discussed in 
Chapter 10 originated in the laboratory but have also been tested in the field. For 
example, researchers have used laboratory-derived principles to explain the way that 
physicians, stockbrokers, jurors, economists, and gamblers reason probabilistically in 
their environments (Kahneman, 2011; Lewis, 2017; Thaler, 2015; Zweig, 2008). The
convergence of laboratory and nonlaboratory results has also characterized several areas
of educational psychology. For example, both laboratory studies and field studies of 
different curricula have indicated that early phonics instruction facilitates the acquisi- 
tion of reading skill (Ehri et al., 2001; Seidenberg, 2017; Willingham, 2017). 


It should be remembered that [...] that the hypothesis that was originally posited
cannot be supported. [...]


[...] It has long been thought there was a way for teachers to measure each [...]. In
any case, teachers are then supposed to be able to “teach to” these styles, resulting
in higher achievement for all. (It is sometimes also claimed that students will
all achieve much more equally if this is done.) [...] (Hood, 2017; Kirschner & van
Merrienboer, 2013; Pashler et al., 2009).




The Progression to More Powerful Methods 


For example, interest in a particular
hypothesis may originally stem from a particular case study of unusual interest. As we
discussed in Chapter 4, this is the proper role of case studies: to suggest hypotheses for
[...] rigorous methods to a research problem. Thus, following the case studies, researchers
undertake correlational investigations to verify whether the link between variables is 
real rather than the result of the peculiarities of a few case studies. If the correlational 
studies support the relationship between relevant variables, researchers will attempt 
experiments in which variables are manipulated in order to isolate a causal relation- 
ship between the variables. 

[...] What type of study is most appropriate often
depends on how advanced the research problem is. Past-president of the Association
for Psychological Science Doug Medin (2012) reminds us that “some well-established 
areas of research may be like Phase III clinical trials, in which the methods and mea- 
sures are settled issues and the only concern is with assessing effect size. Other areas, 
however, may rely on open-ended tasks in which the dependent variable cannot and 
typically should not be specified in advance” (p. 6). 

Discussing the idea of the progression through the more powerful research methods
provides us with a chance to deal with a misconception that some readers may
have derived from Chapter 5, that is, that correlational studies are not useful in
[...] Chapter 5 the complex correlational technique of partial correlation, in which it is
possible to test whether a particular third variable is accounting for a relationship.
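
As a rough numerical illustration of partial correlation, the sketch below applies the standard first-order formula, r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)). The function name and all of the correlation values are hypothetical, chosen only to show the logic; they are not taken from any study discussed in this book.

```python
import math

def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation between x and y, controlling for z."""
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denominator

# Hypothetical case 1: the third variable z is strongly tied to both x and y,
# so the x-y link largely disappears once z is partialed out.
accounted_for = partial_correlation(r_xy=0.30, r_xz=0.60, r_yz=0.50)

# Hypothetical case 2: z is only weakly tied to x and y,
# so most of the x-y link survives.
survives = partial_correlation(r_xy=0.30, r_xz=0.10, r_yz=0.10)

print(round(accounted_for, 3), round(survives, 3))
```

In the first case the relationship shrinks to essentially zero, which is what we would expect if the third variable were accounting for it. In the second case the relationship remains nearly as large as before.
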


Perhaps most important, however, [...] This circumstance, again, is not unique to
psychology. Astronomers obviously cannot manipulate all the variables affecting the objects
they study, yet they are able to arrive at conclusions. 


An example of the evolution of research methods in health psychology is the work 
concerning the link between the type A behavior pattern and coronary heart disease 
(Chida & Hamer, 2008; Martin et al., 2011; Matthews, 2013). The original observations 
that led to the development of the concept of the type A behavior pattern occurred 
[...] a sense of time urgency, free-floating hostility,
and extremely competitive striving for achievement. Thus, the idea of the type A per- 
sonality originated in a few case studies made by some observant physicians. These 
case studies suggested the concept, but they were not taken as definitive proof of the 
hypothesis that a particular type of behavior pattern is a partial cause of coronary 
heart disease. Proving the idea required more than just the existence of a few case
studies. It involved decades of work by teams of cardiologists and psychologists.




The research quickly moved from merely accumulating case studies, which could 
never establish the truth of the hypothesis, to more powerful methods of investigation. 
Researchers developed and tested operational definitions of the type A concept. Large- 
scale epidemiological studies established a correlation between the presence of type 
A behavior and the incidence of coronary heart disease. The correlational work then 
became more sophisticated. Researchers used complex correlational techniques to track 
down potential third variables. The relation between type A behavior and heart attacks 
could have been spurious because the behavior pattern was also correlated with one of 
the other traditional risk factors (such as smoking, obesity, or serum cholesterol level). 
However, results showed that type A behavior was a significant independent predictor 
of heart attacks. When other variables were statistically partialed out, there was still a 
link between the type A behavior pattern and coronary heart disease. 

Finally, researchers undertook experimental studies with manipulated variables 
to establish whether a causal relationship could be demonstrated. Some of the stud- 
ies attempted to test models of the physiological mechanisms that affected the rela- 
tionship and used animals as subjects—what some might call “not real life.” Another 
experimental study used human subjects who had had a heart attack. These sub- 
jects were randomly assigned to one of two groups. One group received counseling 
designed to help them avoid traditional risky behavior such as smoking and eating 
fatty foods. The other group received this counseling and were also given a program
designed to help them reduce their type A behavior. Three years later, there had been
significantly fewer recurrent heart attacks among the patients given the type A behav- 
ior counseling. 

In short, the evidence converged to support the hypothesis of the type A behavior 
pattern as a significant causal factor in coronary heart disease. The investigation of
this problem provides a good example of how [...]

Another lesson we can draw from this example is that [...] an issue first raised in
Chapter 3 when we discussed operational definitions. Recent
research seems to indicate that it is oversimplifying to talk about the connection
between heart attacks and the type A behavior pattern as a whole. The reason is that
[...] have an example of how science uncovers increasingly specific relationships as it
progresses and how theoretical concepts become elaborated.

There is a final point to note in our discussion of scientific consensus. [...]
(Wyse, 2017). Scientists are of
course free to sign whatever petition they want about whatever social or political issue,
but such a document is not what we are talking about in this chapter when we talk
about a scientific consensus. We will see in Chapter 12 that the American Psychological 
Association itself has been guilty of taking positions on social issues that are only loosely 
connected (or not connected at all) to the science in the journals that it publishes. 


A Counsel Against Despair 


One final implication of the convergence principle is that [...] At first, the blur
on the screen could represent just about anything. Then,
as the slide is focused a bit more, many alternative hypotheses may be ruled out even
though the image cannot be identified unambiguously. Finally, an identification can be
made with great confidence. [...]

Thus, [...] Nor is such a situation unique to psychology.
It also occurs in more mature sciences. The contradictions may be simply chance
occurrences (something we will discuss at length in Chapter 11), or they may be due to
subtle methodological differences between experiments. [...] (p. 92, Kachka, 2012).


Many other sciences have endured confusing periods of uncertainty before a con- 
sensus was achieved (Lewis, 2017; Novella, 2015). Medical science certainly displays 
this pattern all the time. For example, research into aspirin’s role as a cancer
preventative has been extremely confusing, uncertain, and nonconverging. Aspirin fights
inflammation by inhibiting substances known as cyclooxygenase, or COX, enzymes. 
Because COX enzymes also are involved in the formation of some cancerous tumors, it 
was thought that daily aspirin might also inhibit this effect. But actual research on this 
speculation has produced inconsistent results. Some researchers think that the incon- 
sistency has to do with the fact that the optimal dosage level has not yet been found. 

Writer Malcolm Gladwell (2004), in an article titled “The Picture Problem,” dis- 
cusses how people have difficulty understanding why the medical profession still has 
disagreements about the degree of benefit derived from mammograms (Beck, 2014; 
Reddy, 2016; University of California, 2016). This is because a mammography picture 
seems so “concrete” to most people that they think it should be determinative. They 
fail to understand that human judgment is necessarily involved, and that mammog- 
raphy assessment and disease prediction are inherently probabilistic (Gigerenzer et al., 
2007). However, Gladwell goes on to note that in this area of medicine, just as in
psychology, [...]

In psychology and many other sciences, [...] (Braver et al., 2014; Card, 2011;
Schmidt & Oh, [...]). [...]
group is compared with another are expressed in a common statistical metric that
allows comparison of effects across studies. [...] In some cases, of course,
no conclusion can be drawn with confidence, and the result of the meta-analysis is
inconclusive.

More and more commentators are calling for a greater emphasis on meta-analysis
as a way of dampening the contentious disputes about conflicting studies in the
behavioral sciences. [...] An emphasis on meta-analysis has often revealed
that we actually have more stable and useful findings than is apparent from a perusal
of the conflicts in our journals. 

The National Reading Panel (2000; Ehri et al., 2001) found just this in their 
meta-analysis of the evidence surrounding several issues in reading education. For 
example, they concluded that the results of a meta-analysis of the results of 38 different 
studies indicated “solid support for the conclusion that systematic phonics instruction 
makes a bigger contribution to children’s growth in reading than alternative programs 
providing unsystematic or no phonics instruction” (p. 84). In another section of their 
report, the National Reading Panel reported that a meta-analysis of 52 studies of pho- 
nemic awareness training indicated that “teaching children to manipulate the sounds 
in language helps them learn to read. Across the various conditions of teaching,
testing, and participant characteristics, the effect sizes were all significantly greater than
chance and ranged from large to small, with the majority in the moderate range” (p. 5). 


[...] It is through meta-analysis that we know that married people are happier than
never married people, and that marriage leads to better health outcomes (Myers, 2015, 
2017; Robles et al., 2014). It is through meta-analysis that we know that the “brain train- 
ing” programs advertised on television, radio, and on the web do not work. Although 
people improve on the specific tasks they train on in these programs, the programs do 
not improve long-term general cognitive functioning, nor do they have lasting effects 
on real-world outcomes (Simons et al., 2016). It is through meta-analysis that we know 
that the personality trait of conscientiousness is related to job performance (Schmidt 
& Oh, 2016). Many of the studies in the job performance literature on this issue had 
found nonsignificant results, but when a large number of these studies were amalgam- 
ated together, a meta-analysis indicated that there was indeed a modest association. 
It is through meta-analysis that we know that our ability to predict suicide has not 
improved in 50 years (Franklin et al., 2017). 
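
The idea of expressing results in a common statistical metric and then amalgamating them can be sketched with a few lines of arithmetic. The example below is only a minimal illustration, assuming a simple fixed-effect (inverse-variance) pooling scheme; real meta-analyses often use more elaborate models. The function name, the four effect sizes, and their sampling variances are all hypothetical and do not come from any of the studies cited above.

```python
import math

def pooled_effect(effects, variances):
    """Fixed-effect (inverse-variance) pooled estimate and its standard error."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    standard_error = math.sqrt(1.0 / sum(weights))
    return pooled, standard_error

# Hypothetical standardized mean differences from four small studies,
# each with the same (hypothetical) sampling variance.
effects = [0.20, 0.35, 0.10, 0.30]
variances = [0.04, 0.04, 0.04, 0.04]

# z statistic for each study taken alone (effect divided by its standard error).
z_each = [d / math.sqrt(v) for d, v in zip(effects, variances)]

# z statistic for the pooled estimate.
pooled, se = pooled_effect(effects, variances)
z_pooled = pooled / se

print([round(z, 2) for z in z_each])
print(round(pooled, 4), round(se, 4), round(z_pooled, 3))
```

With these made-up numbers, no single study reaches the conventional 1.96 cutoff on its own, yet the pooled estimate does. That is the pattern described above for the job performance literature: individually nonsignificant studies that, once amalgamated, point to a modest association.
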


[...] In an earlier chapter, we discussed how
personal anecdotes had impeded patients and doctors from implementing the
recommendation of the U.S. Preventive Services Task Force (USPSTF) that the prostate-specific
antigen (PSA) test to screen for prostate cancer not be used (Arkes & Gaissmaier, 2012).
The USPSTF’s review of the scientific evidence indicated that the harms associated 
with the test (the side effects associated with unnecessary treatment) outweighed the 
benefits in mortality (which were tiny at best, and perhaps nonexistent). Their recom- 
mendation was heavily based on meta-analytic results. 

It is likewise in the domain of health psychology. Chida and Hamer (2008) meta- 
analyzed data from a whopping 281 studies relating the hostility and aggression 
aspects of the Type A behavior pattern to cardiovascular reactivity (heart rate and 
blood pressure) in order to establish that there was indeed a relationship. As another 
example, Currier, Neimeyer, and Berman (2008) meta-analyzed 61 controlled studies 
of psychotherapeutic interventions for bereaved persons. Their meta-analysis had a
[...]. The outcome from this meta-analysis helps to remind us that the outcome of a meta-analysis
is not always positive. That is, it does not always tell us that, from a series of widely 
varying studies, something is there. Just as often it tells us that, when we amalgamate 
the results from a large number of varying studies—nothing is there! 


Summary


In this chapter, we have seen how the breakthrough model of scientific advance is a
bad model for psychology and why the gradual-synthesis model provides a better
framework for understanding how conclusions are reached in psychology. [...]


Chapter 9 


The Issue of Multiple Causation


Learning Objectives 


9.1 Explain the concept of interactions between variables in 
psychological research 


9.2 Outline the difficulties in acknowledging multiple causation to a 
phenomenon 


In Chapter 8, we focused on the importance of converging operations and the need to 
progress to more powerful research methods in order to establish a connection between 
variables. In this chapter, we go beyond a simple connection between two variables to 


highlight an important point: [...]
researchers have found a negative relationship between amount of television and other
media viewing and academic achievement, but they do not claim that the amount of 
media viewed is the only thing that determines academic achievement. That, of course, 
would be silly, because academic achievement is partially determined by a host of other 
variables (home environment, quality of schooling, cognitive ability, and the like). In 
fact, media viewing is only a minor determinant of academic achievement when
compared with these other factors. Likewise, Jaffee et al. (2012) examined the literature on
the potential causes of antisocial behavior in youth. The evidence converged on several
different factors as causal, including: peer deviance, living in a divorced household, 
parental depression, adolescent motherhood, coercive discipline, and poverty. 


([...] Furuya-Kanamori & Doi, 2016). But [...]

Like so many of the other principles discussed in this book, it is important to
put the idea of multiple causes in perspective. On the one hand, [...]
The world is complicated, and the
determinants of behavior are many and complex.


On the other hand, [...] Second, [...]
Few would argue that a variable that could reduce the number of acts of physical
violence by as much as 1 percent annually is not of enormous importance. In short, if 
the behavior in question is of great importance, then knowing how to control only a 
small proportion of it can be extremely useful. 

There have been medical studies in which a treatment accounted for less than 
1 percent of the variability in the outcome, yet the results were considered so startlingly 
positive that the study was terminated prematurely for ethical considerations—that is, 
the outcome of the experiment was considered so strong that it was deemed unethical 
to withhold the treatment from the placebo group (Ferguson, 2009; Rosenthal, 1990). 


Likewise, any factor that could cut motor vehicle deaths by just 1 percent would be
immensely important: it would save over 250 lives each year. Reducing the homicide
rate by just 1 percent would save over 140 lives each year. In short, [...]


The Concept of Interaction 


[...] This is called [...] The
concept of interaction really needs to be learned in depth
(and with numbers), and statistics texts cover it in detail. We can only mention the con- 
cept and give some quick examples here, so this chapter will be a short one. 


([...] Evans et al., 2013). [...] For example, researchers might be studying the
academic achievement of
adolescents to see whether it is a function of life changes such as school transition, 
pubertal development, residential mobility, and family disruption. It would not be 
uncommon to find that no single factor had a huge effect, but that when several of 
these life changes were conjoined together, they resulted in a substantial fall-off in 
academic achievement. 

To understand the logic of what is happening when an interaction such as this 
occurs, imagine a risk scale where a score of 80-110 represents low risk, 110-125 mod- 
erate risk, and 125-150 high risk. Imagine that we had found an average risk score of 
82 for children with no stressors, an average risk score of 84 for children with stress 
factor A, and an average risk score of 86 for children with stress factor B. An interac- 
tion effect would be apparent if, when studying children with both risk factor A and 
risk factor B, we found an average risk score of 126. That is, the joint risk when two risk 
factors were conjoined was much greater than what would be predicted from studying
each risk factor separately. 
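
The arithmetic behind this example can be laid out explicitly. The sketch below uses the risk scores given above (82, 84, 86, and 126); the variable names, the additive baseline for comparison, and the exact handling of the band boundaries (a score of exactly 110 or 125 is assigned to the higher band) are assumptions made only for illustration.

```python
# Average risk scores from the example in the text.
no_stressors = 82
factor_a_only = 84
factor_b_only = 86
both_factors_observed = 126

# Separate effect of each factor relative to the no-stressor baseline.
effect_a = factor_a_only - no_stressors
effect_b = factor_b_only - no_stressors

# What we would predict for children with both factors if the effects simply added.
additive_prediction = no_stressors + effect_a + effect_b

# The interaction is the part of the joint score that adding cannot explain.
interaction = both_factors_observed - additive_prediction

def risk_band(score: float) -> str:
    """Classify a score using the bands in the text (boundary handling assumed)."""
    if score < 110:
        return "low"
    if score < 125:
        return "moderate"
    return "high"

print(additive_prediction, risk_band(additive_prediction))
print(both_factors_observed, risk_band(both_factors_observed))
print(interaction)
```

Studying each factor separately would lead us to expect a score still well inside the low-risk band. The observed joint score lands in the high-risk band, and the gap between the two is the interaction.
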

[...] Grant et al.
(2015) found that exposure to a synthetic stress hormone (synthetic glucocorticoids) only
had a negative effect on children’s cognitive functioning if the children also experienced 
sociodemographic adversity (low maternal education, birth when mother under 18, low 
income, mother a single parent). The stress hormone did not impair cognitive function- 
ing as long as the child did not experience any sociodemographic adversity. 


[...] For example, variations in the so-called
5-HTT gene have been found to be related to major depression in humans (Hariri &
Holmes, 2006). People with one variant (the S allele) are more likely to suffer from
major depression than people with the other variant of the gene (the L allele). However,
this greater risk for those with the S allele is only true for those who have also suffered
multiple traumatic life events, such as child abuse or neglect, job loss, and/or divorce.
[...] (Dodge & Rutter, 2011). The relationship between variants of the
monoamine oxidase A (MAOA) gene and antisocial behavior provides an example. One
variant of the gene increases the probability of antisocial behavior, but only if other 
risk factors are present, such as child abuse, birth complications, or negative home 
environments (Raine, 2008). 

Often it is two psychological characteristics that display an interaction. An example 
is provided by research on the link between rumination and depression. The tendency to 
ruminate does predict the duration of depressive symptoms, but it interacts with cogni- 
tive styles—rumination predicts lengthened periods of depressive symptoms only when 
conjoined with negative cognitive styles (Nolen-Hoeksema et al., 2008). 

Components of programs may have interactive effects. Developmental psy- 
chologist Dan Keating (2007) has reviewed the literature on the consequences of 
states’ Graduated Driver Licensing programs on teen driver safety. These programs 
work—they lower the rate of teen auto crashes and teen auto fatalities. However, 
the programs are all different from state to state, each state having somewhat
different subsets of several basic components: required driver education, passenger
[...] whether each of these components is causally effective and whether they have any
interactive effects. Research indicates that no one of the components lowers teen 
crash or fatality rates. However, in combination they can lower the number of teen 
fatalities by over 20 percent. 


Thus, [...] First, [...] Second, [...]

Clinical psychologist Scott Lilienfeld (2006) discusses the continuum of causal
influence for variables, from strong to weak.




The Temptation of the Single-Cause 
Explanation 


It seems that the basic idea that complex events in the world are multiply determined 
should be an easy one to grasp. In fact, the concept is easy to grasp and to apply when 
the issues are not controversial. However, when our old nemesis, preexisting bias (see 
Chapter 3), rears its head, people have a tendency to ignore the principle of
multiple causation. How many times do we hear people arguing about such emotionally
charged issues as the causes of crime, the distribution of wealth, the causes of pov- 
erty, changes in marriage rates, and the effect of capital punishment in a way that 
implies that these issues are simple and unidimensional and that outcomes in these 
areas have a single cause? These examples make it clear that people will sometimes 
acknowledge the existence of multiple causes if asked directly about multiple causes; 
but seldom will they spontaneously offer many different causes as an explanation for 
something they care about. Most often, people adopt a “zero sum” attitude toward 
potential causes—that all causes compete with one another and that emphasizing one 
necessarily reduces the emphasis on another. 

Under emotional influence, we tend to forget the principle of multiple causation. 
[...] political spectrum. Liberals may argue that people of low-socioeconomic status who
commit crimes may themselves be victims of their circumstances (e.g., joblessness, poor
housing, poor education, and lack of hope about the future). Conservatives may reply
that a lot of poor people do not commit crimes; therefore, economic conditions are
not the cause. Instead, the conservative may argue, it is personal values and personal
character that determine criminal behavior. Neither side in the debate ever seems
to acknowledge that both individual factors and environmental factors contribute to
criminal behavior.


Consider also discussions of the causes of complex economic outcomes. These 
outcomes are hard to predict precisely because they are multiply determined. For 
example, economic debate has focused on a problem of the last several decades with
important social implications: [...] ([...] Crooks,
2008; Caldwell, 2016; Conard, 2016; Fairless, 2017; Lemann, 2012; Murray, 2012).

[...] What have economic studies found
with respect to these many alternative causes? You guessed it. [...] It also appears
that [...] For example, [...]

It can be particularly difficult to acknowledge multiple causation when the 
different potential causes are attached to different ideological positions. Then, 
it becomes very tempting for each side to promote its own cause (or causes) and 
dismiss the causes of their ideological opponents. The causes of poverty provide a
prime example of this tendency—with liberals and conservatives having different 
favorite causal models and tending to denigrate the causal models of the other side. 
The answer to this impasse is for each side to acknowledge multiple causation at the 
outset and to concede that some of the multiple causes probably include those of 
their ideological opponents. 

Psychologist Jonathan Haidt of New York University participated in an endeavor 
that attempted to break this impasse by truly acknowledging multiple causation 
(AEI/Brookings Working Group on Poverty, 2015). A group of experts from across the 
ideological spectrum agreed to come up with a set of solutions for poverty that were 
consensus positions—endorsed by both sides of the ideological divide. Striving for 
consensus necessarily meant that members of the panel would have to endorse solu- 
tions to poverty that came from their ideological opponents. The group was largely 
successful in producing a report that truly acknowledged multiple causation when it 
came to their policy recommendations. For example, the liberals on the panel agreed 
that the data supported the conservative policy recommendation of promoting a new 
cultural norm surrounding parenthood and marriage, as well as the policy
recommendation of promoting delayed, responsible childbearing. Likewise, the conservatives on
the panel agreed that the data supported the liberal policy recommendation of making
work pay better for the less educated and of ensuring that jobs were available. The
consensus recommendations thus embodied the principle of multiple causation. 


Like economic problems, [...]
Take the problem of learning disabilities, for
example, which educational psychologists, cognitive psychologists, and
developmental psychologists have investigated extensively. [...]
(Peterson & Pennington, 2012;
Seidenberg, 2017; Tanaka et al., 2011), [...]
(Peterson & Pennington, 2012). [...]
The reason it would be wrong is
that [...]

A similar situation characterizes the causes and treatment of depression. 
[...] Likewise, a multiplicity of treatments combined (medication plus
psychotherapy) seems to result in the best therapeutic outcome (Engel, 2008).

Once the multiple causes of a complex phenomenon are found, if the
phenomenon is a problem, this necessarily means that the solution to the problem will require
multiple interventions. Decades ago we had a major health problem—an epidemic 
of smoking, a habit linked to many diseases. In recent decades, various interventions 
have reduced the level of smoking in our society: Tobacco advertising was banned, 
tobacco taxes were raised, the nicotine patch became available, smoking was banned 
in public places, and many more interventions were instituted (Brody, 2011). Slowly, 
over decades, the rate of smoking went down because of these multiple interventions 
targeted at its many causes. 

Just as it took many different interventions to reduce smoking years ago, it will 
take multiple societal interventions to halt and reverse our current national epi- 
demic of obesity (King, 2013; Taubes, 2017). The reason is that our current obesity 
epidemic started a couple of decades ago because of many different trends coinciding: 
Walking decreased, fewer meals were prepared at home when more women entered
the workforce, the fast-food industry exploded in size, food advertising became
ubiquitous, electronic entertainment made children sedentary, portions increased in size,
along with many other factors (Hewer, 2014; University of California, 2015b). The 
solution to this national problem will have to be correspondingly multifaceted. The 
Wellness Newsletter of the University of California warns that “it’s overly simplistic 
to blame the obesity epidemic solely on people eating too much because of lack of 
willpower and sedentary lifestyles. If there ever was a multifactorial condition, obesity 
is it—a complex of interacting genetic, metabolic, behavioral, hormonal, psychologi- 
cal, cultural, environmental, and socioeconomic factors” (p. 1, University of California, 
2015b). Science writer Gina Kolata (2016b) puts it even more simply by titling her article on
obesity: No Single Answer.


Summary 


Chapter 10 


Probabilistic Reasoning


Learning Objectives 


10.1 Describe how people use “person-who” arguments to refute
[...] findings


10.2 Explain how probabilistic prediction means learning to live with
some uncertainty


10.3 Outline some pitfalls in dealing with probability, including the 
gambler’s fallacy and ignoring sample size 


Question: 
Men are taller than women, right? 


Answer: 
“Right.” 


Question: 
All men are taller than all women, right? 


Answer: 
“Wrong.” 


Correct. Believe it or not, we are going to devote part of this chapter to something that
you just demonstrated you knew by answering the previous two questions. But don’t
skip the chapter just yet, because there are some surprises waiting in the explanation
of what seems like a very simple principle.

You correctly interpreted the statement as reflecting a probabilistic
trend rather than a fact that holds in every single instance. [...] That
is, [...]
tends to be warmer near the equator. Families tend to have fewer than eight children.
Most parts of the earth tend to have more insects than humans. These are all statisti- 
cally demonstrable trends, yet there are exceptions to every one of them. 

Before his death from lung cancer, neurosurgeon Paul Kalanithi (2016) wrote a
moving book about dealing with the disease in the latter days of his life. In his book, he
discussed how doctors present prognoses to patients, and was particularly harsh on 
physicians who did not emphasize to patients that prognoses were probabilistic. He 
recommended presenting patients with intervals (“Most patients live many months 
to a couple of years”) rather than specific best guesses (“Median survival is eleven 
months”). He felt that phrases like “most patients live many months to a couple of 
years” more fundamentally portrayed the probabilistic nature of prediction. 

Americans received a sad lesson in the probabilistic nature of medical knowledge 
in the summer of 2008 when much-loved political broadcaster Tim Russert died of 
a heart attack at age 58. Russert took cholesterol pills and low-dose aspirin, rode an 
exercise bike, and had yearly stress tests, yet he still died early of a heart attack. The 
fact that he had been fairly vigilant toward his health led many readers of the New York 
Times to write in saying that the doctors must have missed something. These readers 
did not understand that medical knowledge is probabilistic. Every failure to predict 
is not a mistake. In fact, his doctors missed nothing. They applied their probabilistic 
knowledge as best they could—but this does not mean that they could predict indi- 
vidual cases of heart attack. Science writer Denise Grady (2008) tells us that, based on 
his stress test and many other state-of-the-art diagnostics that Mr. Russert was given in 
his last exam, the doctors estimated—from a widely used formula—that Mr. Russert’s 
probability of a heart attack in the next ten years was 5 percent. This means that 95 
out of 100 people with Mr. Russert’s medical profile should not have a heart attack in 
the next ten years. Mr. Russert was just one of the unlucky five—and medical science, 
being probabilistic, cannot tell us in advance who those unlucky five will be. 
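
The logic of this example can be illustrated with a small simulation. The sketch below is only a toy model under stated assumptions: it treats each of 100 people as having an independent 5 percent chance of a heart attack over ten years, and the function name, the seed, and the number of repetitions are arbitrary choices made for illustration.

```python
import random

def simulate_cohort(n_people: int, risk: float, rng: random.Random) -> list:
    """Return the indices of the people who have the event in one simulated decade."""
    return [person for person in range(n_people) if rng.random() < risk]

rng = random.Random(2008)   # arbitrary seed so the run is repeatable
N_PEOPLE = 100              # people sharing the same medical profile
RISK = 0.05                 # ten-year probability from the text

# The group-level prediction: how many cases to expect.
expected_cases = N_PEOPLE * RISK

# Repeat the simulated decade many times and average the number of cases.
counts = [len(simulate_cohort(N_PEOPLE, RISK, rng)) for _ in range(10_000)]
average_cases = sum(counts) / len(counts)
print(expected_cases, round(average_cases, 2))

# Two further simulated decades: the number of cases is similar,
# but which individuals are affected is not predictable in advance.
print(simulate_cohort(N_PEOPLE, RISK, rng))
print(simulate_cohort(N_PEOPLE, RISK, rng))
```

Across many repetitions the average number of cases settles close to five, so the prediction about the group is accurate. The list of affected individuals, however, changes from run to run, which is why an accurate aggregate prediction still cannot name the unlucky five ahead of time.
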

The Tim Russert example provides an opportunity to emphasize that a probabilistic
prediction is still a real prediction. Here is what we mean by this: Before the fact,
the unlucky five cannot be named. But after they
are dead, those five people most definitely do have names. For example, Tim Russert 
turned out to be one of the five. He is no less dead than he would be if we could have 
named him in advance. We must get over this feeling that, because of its numerical
nature, a probabilistic prediction is somehow less real. Recall from Chapter 8 the point
that, because of cell phone talking and texting in cars, hundreds of Americans will die
unnecessarily in crashes in the upcoming
year. Because this is a probabilistic prediction, I cannot tell you who these Americans
will be. However, the prediction is no less real just because it is probabilistic. 


Science writer Natalie Angier (2007) discusses
how some people think that seismologists really can predict individual earthquakes,
but that they do not make these predictions public so as “not to create a panic.” One 
seismologist received a letter from a woman asking him to tell her if he ever sent his 
children to see out-of-town relatives. From this example, Angier notes that people 
seem to prefer to believe that authorities are engaged in monstrous lying than to sim- 
ply admit that there is uncertainty in science. 

Political pollsters, for example, learn to live with this uncertainty, even though the 
public that they serve is not comfortable with it. After the 2016 presidential election 
in the United States, pollsters took a lot of flak for their faulty predictions. Actually
though, the pollsters were pretty close in calling the popular vote. What they failed to 
predict correctly was the outcome in the electoral college. Pollster and statistician Nate 
Silver was particularly victimized by public misunderstanding of probabilistic pre- 
diction. Close to the election, his prediction was a 71 percent probability that Hillary 
Clinton would win the electoral college. Democrats were furious with him because
most other pollsters were setting Clinton’s chance of winning the electoral college at
over 90 percent (Flint & Albert, 2016; Hemingway, 2016; Lohr & Singer, 2016). One 
Princeton poll had the probability of a Clinton electoral college win at 99 percent! 
Democratic websites accused Silver of skewing his analysis in Donald Trump’s favor. 
Of course, after the election, Silver got little credit, because he had still predicted the
wrong winner.

Virtually all the facts and relationships that have been uncovered by the science 
of psychology are stated in terms of probabilities. There is nothing unique about this. 
Many of the laws and relationships in other sciences are stated in probabilities rather than 
certainties. The entire subdiscipline of population genetics, for example, is based on prob- 
abilistic relationships. Physicists tell us that the distribution of the electron’s charge in an 
atom is described by a probabilistic function. Thus, the fact that behavioral relationships 
are stated in probabilistic form does not distinguish them from those in other sciences. 


“Person-Who” Statistics


Smoking causes lung cancer and a host of other health problems. Voluminous medical evidence
documents this fact (Gigerenzer et al., 2007). Yet will everyone who smokes get lung
cancer, and will everyone who refrains from smoking be free of lung cancer? Most
people know that these implications do not follow. The relationship is probabilistic:
it tells us that smokers as a group are at greater risk, but it cannot tell us which ones
will die. We are all aware of this. Or are we? How often have we seen
a nonsmoker trying to convince a smoker to stop by citing the smoking–lung-cancer
statistics, only to have the smoker come back with “Oh, get outta here! Look at old Joe
Ferguson down at the store. Three packs of Camels a day since he was sixteen! Ninety-one
years old and he looks great!” The obvious inference that one is supposed to draw
is that this single case somehow invalidates the relationship.


It is surprising and distressing how often this ploy works. If people think a single
example can invalidate a law, they must feel the law
should hold in every case. In short, they have failed to understand the law’s probabilistic
nature. There will always be a “person who” goes against even the strongest of trends.


Psychologists call instances like the “old Joe Ferguson” story examples of the use of
“person-who” statistics. The ubiquitous “person who” is usually trotted out when we are
confronted with hard statistical
evidence that contradicts a previously held belief. Thus, it could be argued that
the “person who” is nothing more than a convenient debating strategy. However, research on
human decision making and reasoning suggests that the tendency to use the “person
who” comes not simply from its usefulness as a debating strategy. Instead, it appears to
reflect a genuine difficulty in reasoning about probabilities, an Achilles’ heel of human
cognition. It is so much of an Achilles’ heel that probabilistic reasoning is at the heart of
the operational definition of human rationality (Stanovich, West, & Toplak, 2016).


Probabilistic Reasoning and the
Misunderstanding of Psychology 


The findings of psychology are often misunderstood because of the problems people
have in dealing with probabilistic information. Most people understand
the statement “smoking causes lung cancer” as a statement of a probabilistic trend (although old
“Joe Ferguson” can be convincing to some smokers who do not want to believe that
their habit may be killing them!). However, people often have more trouble treating
findings about behavior in the same way. Most psychology instructors have
witnessed a very common reaction when they discuss the evidence on certain behavioral
relationships. For example, the instructor may present the fact that children’s
scholastic achievement is related to the socioeconomic status of their households and
to the educational level of their parents. This statement often prompts at least one
student to object that he has a friend who is a National Merit Scholar and whose father
finished only eighth grade. Even those who understood the smoking–lung-cancer
example tend to waver at this point.


Most people understand that many treatments, theories, and facts developed by medical 
science are probabilistic. They understand that, for example, a majority of patients, but 
not all of them, will respond to a certain drug. Medical science, however, often cannot tell 
in advance which patients will respond. Often all that can be said is that if 100 patients 
take treatment A and 100 patients do not, after a certain period the 100 patients who 
took treatment A will collectively be better off. I mentioned in an earlier chapter that I 
take a medication called Imitrex (sumatriptan succinate) for relief from migraine head- 
aches. The information sheet accompanying this drug tells me that controlled studies 
have demonstrated that, at a particular dosage level, 57 percent of patients taking this 
medication receive relief in two hours. I am one of the lucky 57 percent, but no one
could have told me in advance that I would not be one of
the unlucky 43 percent. (We will discuss actuarial prediction
in more detail in the next chapter.)

Consider an unhealthy person going to a physician. The person is told that 
unless he or she exercises and changes diet, he or she has a high risk of heart attack. 
We are not tempted to say that the doctor has no useful knowledge because he or 
she cannot tell the person that without a change of diet he or she will have a heart 
attack on September 18, 2024. We tend to understand that the physician’s predictions 
are probabilistic and cannot be given with that level of precision. It is likewise when 
geologists tell us that there is a 60 percent probability of a magnitude 7.0 or greater 
earthquake in a certain area in the next 30 years (Silver, 2012). We do not denigrate 
their knowledge because they cannot say that there will be an earthquake exactly 
here on July 5, 2023.
Yet not everyone fully understands this. In April of 2009, an earthquake
occurred in L’Aquila, Italy, and killed 309 people (Diacu, 2012; Radford, 2016b). It
injured over 1,500 people. Incredibly, in 2012, an Italian court levied a criminal
conviction against six of the country’s seismologists for not accurately predicting the
earthquake! The conviction was overturned in 2016, but it demonstrates how difficult
it is for the public, and sometimes the courts, to understand the fundamental
uncertainty in predicting individual cases (Silver, 2012).

It is likewise when
a clinical psychologist recommends a program for a child with self-injurious behavior.
The psychologist judges that there is a higher probability of a good outcome if a certain 
approach is followed. But unlike the heart attack and earthquake examples, the psycholo- 
gist is often confronted with questions like “but when will my child be reading at grade 
level?” or “exactly how long will he have to be in this program?” These are unanswerable 
questions—in the same way that the questions about exactly when the earthquake or the 
heart attack will occur are also unanswerable questions. They are unanswerable because 
in all these cases—the heart attack, the learning disabled child, the earthquake, the child 
with self-injurious behavior—the prediction being made is probabilistic. 




Psychological Research 
on Probabilistic Reasoning 


In the past three decades, psychologists such as Daniel Kahneman, a winner of the
Nobel Prize, have studied how people reason about probabilities.
In the course of their studies, these investigators have uncovered some fundamental
principles of probabilistic reasoning that are absent or, more commonly, insufficiently
developed in many people. As has often been pointed out, it should not be surprising
that they are insufficiently developed. As a branch of mathematics, probability theory
is a very recent development. The key initial developments did not occur until the
sixteenth and seventeenth centuries (Hand, 2014; Mazur, 2016), and many essential
developments date not much past the twentieth century.

The dates of the initial developments in probability theory highlight a significant 
fact: Games of chance existed centuries before the fundamental laws of probability 
were discovered. Here is another example of how personal experience does not seem 
to be sufficient to lead to a fundamental understanding of the world (see Chapter 7). 
It took formal study of the laws of probability to reveal how games of chance work. 


Thousands of gamblers and their “personal experiences” throughout history were 
insufficient to uncover the underlying nature of games of chance. 




“Why did they raise my insurance rate,” you might wonder, “and why is John’s
rate higher than Bill’s? Is Social Security going broke? Is our state lottery crooked? Is
crime increasing or decreasing? Why do doctors order all those tests? Why can people
be treated with certain rare drugs in Europe and not in the United States? Do women
really make less than men in comparable jobs? Do international trade deals cost
Americans jobs and drive down wages? Is educational achievement in Japan really
higher than here?” These are all good questions: concrete, practical questions about
our society and how it works. To understand the answers to each of them, however,
one must be able to think statistically.


Clearly, a complete discussion of statistical and probabilistic thinking is beyond 
the scope of this book. We will, however, briefly discuss some of the more common 
pitfalls of probabilistic reasoning. A good way to start developing the skill of probabi- 
listic thinking is to become aware of the most common fallacies that arise when people 
reason statistically. 


Insufficient Use of Probabilistic Information 


One finding that has been much replicated is that people tend to overweight concrete
case information and underweight probabilistic information
(the vividness problem discussed in Chapter 4).
Here is a problem (see Stanovich, 2010) that even experienced
decision makers such as physicians find difficult: Imagine that the virus that causes 
AIDS (HIV) occurs in 1 in every 1,000 people. Imagine also that there is a test to diag- 
nose the disease that always indicates correctly that a person who has HIV actually has 
it. Finally, imagine that the test has a false-positive rate of 5 percent. This means that the 
test wrongly indicates that HIV is present in 5 percent of the cases in which the person 
does not have the virus. Imagine that we choose a person randomly and administer 
the test and that it yields a positive result (indicates that the person is HIV-positive). 
What is the probability that the individual actually has the HIV virus, assuming that 
we know nothing else about the individual’s personal or medical history? 

The most common answer to this problem (even among experienced physicians) 
is 95 percent. The correct answer is approximately 2 percent. People vastly overesti- 
mated the probability that a positive result truly indicated the disease because of the 


tendency to overweight the case information and underweight the base rate informa- 
tion (that only 1 in 1,000 people are HIV-positive). A little logical reasoning can help 


to illustrate the profound effect that base rates have on probabilities. Of 1,000 people, 
only 1 will actually be HIV-positive. If the other 999 (who do not have the disease) 
are tested, the test will indicate incorrectly that approximately 50 of them have the 
virus (0.05 multiplied by 999) because of the 5 percent false-positive rate. Thus, of the 
51 patients testing positive, only 1 (approximately 2 percent) will actually be
HIV-positive. In short, the base rate is such that the vast majority of people do not have
the virus (only 1 in 1,000). This fact, combined with a substantial false-positive rate, 
ensures that, in absolute numbers, the vast majority of positive tests will be of people 
who do not have the virus. 
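
The arithmetic in this problem is an application of Bayes' rule, and it can be checked with a few lines of code. What follows is only a minimal sketch: the function name `posterior_probability` and its parameter names are our own illustrative choices, and the three input values are simply the ones stated in the problem.

```python
def posterior_probability(base_rate, sensitivity, false_positive_rate):
    """Probability of having the condition given a positive test (Bayes' rule)."""
    true_positives = base_rate * sensitivity
    false_positives = (1 - base_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Values from the problem: a base rate of 1 in 1,000, a test that never
# misses a true case, and a 5 percent false-positive rate.
p = posterior_probability(base_rate=0.001, sensitivity=1.0, false_positive_rate=0.05)
print(f"{p:.1%}")  # about 2 percent, not 95 percent
```

Raising the base rate in this sketch (for example, to 1 in 10) shows how strongly the answer depends on it, which is exactly the information most people neglect.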

Although most people recognize the correctness of this logic, their initial tendency
is to discount the base rates and overweight the clinical evidence.

In this problem, the case evidence (the laboratory test result) seems tangible and
concrete to most people, whereas the probabilistic evidence seems, well, probabilistic.
This reasoning, of course, is fallacious because the case evidence is itself
probabilistic. A clinical test misidentifies the presence of a disease with a certain probability.
The situation is one in which two probabilities (the probable diagnosticity of the case
evidence and the prior probability, or base rate) must be combined if one is to arrive at
a correct decision. There are right and wrong ways of combining these probabilities, 
and more often than not—particularly when the case evidence gives the illusion of 
concreteness (recall our discussion of the vividness problem in Chapter 4)—people 
combine the information in the wrong way. 


This problem illustrates the importance of paying attention to false-positive rates. A
false-positive rate (5 percent) combined with a low base rate for the disease (only 1 in 1,000)
resulted in the following consequence: More people with a positive test result did not 
have the disease than did have it. Attention to false-positives is a critical concern in 
all diagnostic testing, including in medicine where, despite great advances in treat- 
ment and diagnosis, most clinical tests still have substantial false-positive rates. In 
one study of 30,000 older men, it was found that after taking four screening tests for 
prostate, lung, and colorectal cancer, more than one-third of the men received a false- 
positive result—the test indicated that they had cancer when in fact they were cancer 
free (Croswell et al., 2009). 


Failure to Use Sample-Size Information 
Consider these two problems (see Kahneman, 2011): 


1. A certain town is served by two hospitals. In the larger hospital, about 45 babies 
are born each day, and in the smaller hospital, about 15 babies are born each 
day. As you know, about 50 percent of all babies are boys. However, the exact 
percentage varies from day to day. Sometimes it is higher than 50 percent, some- 
times lower. For a period of one year, each hospital recorded the days on which 
more than 60 percent of the babies born were boys. Which hospital do you think 
recorded more such days? 

a. The larger hospital 
b. The smaller hospital 
c. About the same 


2. Imagine an urn filled with balls, two-thirds of which are of one color and one-third
of another. One individual has drawn 5 balls from the urn and found that 4 are red
and 1 is white. Another individual has drawn 20 balls and
found that 12 are red and 8 are white. Which of the two individuals should feel
more confident that the urn contains two-thirds red balls and one-third white
balls, rather than vice versa? What odds should each individual give?


In problem 1, the majority of people answer “about the same.” People not choosing 
this alternative pick the larger and the smaller hospital with about equal frequency.
Because the correct answer is the smaller hospital, approximately 75 percent of
subjects given this problem answer incorrectly. These incorrect answers result from an
inability to recognize the importance of sample size in the problem. Other things being
equal, a larger sample tends to estimate a population value more accurately. Thus,
on any given day, the larger hospital, with its larger sample size, will tend to have a
proportion of births closer to 50 percent. Conversely, a small sample is
more likely to deviate from the population value. Thus, the smaller hospital will have
more days on which the proportion of births deviates markedly from the
population value (60 percent boys, 40 percent boys, 80 percent boys, etc.).
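
The hospital problem can be worked out exactly with the binomial distribution. The sketch below makes several simplifying assumptions that are ours, not part of the original problem: each hospital has exactly 15 or 45 births every day, births are independent, and the probability of a boy is exactly 0.5. The function name is hypothetical.

```python
from math import comb


def prob_more_than_60_percent_boys(n_births, p_boy=0.5):
    """Exact binomial probability that more than 60% of n_births are boys."""
    total = 0.0
    for boys in range(n_births + 1):
        if 10 * boys > 6 * n_births:  # integer test for "more than 60 percent"
            total += comb(n_births, boys) * p_boy**boys * (1 - p_boy) ** (n_births - boys)
    return total


small = prob_more_than_60_percent_boys(15)  # smaller hospital
large = prob_more_than_60_percent_boys(45)  # larger hospital
print(f"smaller hospital: {small:.3f}, larger hospital: {large:.3f}")
```

Under these assumptions the smaller hospital should record such a day roughly 15 percent of the time, and the larger hospital considerably less often, which is the pattern the sample-size argument predicts.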

In problem 2, most people feel that the sample of 5 balls provides more convincing 
evidence that the urn is predominantly red. Actually, the probabilities are in the oppo- 
site direction. The odds are 8 to 1 that the urn is predominantly red for the 5-ball sam- 
ple, but they are 16 to 1 that the urn is predominantly red for the 20-ball sample. Even 
though the proportion of red balls is higher in the 5-ball sample (80 percent versus 
60 percent), this is more than compensated for by the fact that the other sample is four
times as large and, thus, is more likely to be an accurate estimate of the proportions
in the urn. The judgment of most subjects, however, is dominated by the higher pro- 
portion of red in the 5-ball sample and does not take adequate account of the greater 
reliability of the 20-ball sample. 
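
The odds of 8 to 1 and 16 to 1 can be reproduced with a short likelihood-ratio calculation. This is a sketch under assumptions we are adding for illustration: the two hypotheses (mostly red versus mostly white) are equally likely beforehand, and each draw is independent, as if balls were replaced or the urn were very large. The 5-ball sample is taken to be 4 red and 1 white, consistent with the 80 percent figure given in the text. The function name is hypothetical.

```python
from fractions import Fraction


def odds_mostly_red(red, white):
    """Posterior odds (mostly red : mostly white), starting from even prior odds."""
    two_thirds, one_third = Fraction(2, 3), Fraction(1, 3)
    likelihood_mostly_red = two_thirds**red * one_third**white
    likelihood_mostly_white = one_third**red * two_thirds**white
    return likelihood_mostly_red / likelihood_mostly_white


print(odds_mostly_red(4, 1))   # 8  (5-ball sample: 4 red, 1 white)
print(odds_mostly_red(12, 8))  # 16 (20-ball sample: 12 red, 8 white)
```

The ratio simplifies to 2 raised to the power of (red minus white), so what matters is the difference between the two counts, not the proportion of red balls in the sample.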


Smaller samples tend to generate more extreme values. Psychologist Daniel
Kahneman (2011) noted that overlooking this fact can
send us on a wild goose chase in search of causal theories when none are needed. He
pointed out that a study of 3,141 counties in the United States found that the counties
in which the incidence of kidney cancer was lowest tended to be rural counties
that were sparsely populated. Kahneman (2011) pointed out how easy it would be 
to come up with a causal theory about why this was the case: “the clean living of 
the rural lifestyle—no air pollution, no water pollution, access to fresh food with- 
out additives” (p. 109). The only problem with this causal theory is that it does not 
account for another finding from the same study: The counties in which the incidence 
of kidney cancer was highest tended to be rural counties that were sparsely popu- 
lated! Had we been told this last fact first, we might have started to posit explana- 
tions of rural counties having more smoking, drinking, and high-fat diets. But this, 
and the earlier explanation for the low-incidence counties, would both have been off 
the mark. What we have here is the hospital problem discussed previously playing 
out in real life. Rural counties with sparse populations are small samples, and they 
are bound to produce more extreme values of all types—extremely high values and 
extremely low values. 

Many people have problems recognizing that they are in situations involving 
sampling. That is, they have difficulty realizing that they are dealing with a sample 
rather than the entire entity. Failure to realize this leads them to miss the fact that
measurements based on a sample contain some error. For example, when a blood
test is ordered by your physician, what is taken from you will be a sample, and it is
the sample that will be assessed, not the state of your entire blood system. The result
will contain some error because
the cells in the sample and their composition and properties will necessarily deviate
a little bit from absolute truth: the test cannot measure your entire blood
system. In short, your physician is making assumptions about your entire composition
from a tiny sample.

It is likewise when a tumor is biopsied. There is some error involved, because the 
biopsy yields only a small sample from a larger tumor. Medical writer Tara Parker- 
Pope (2011), in discussing the biopsy done for suspected prostate cancer, informs us 
that a very common type of biopsy samples only about one three-thousandth of the
prostate. She cites evidence that staging and grading mistakes occur in about 20 percent
of specimens. The point to realize is that it is the same when we are measuring
behavior. We often are taking a small sample to represent a much larger population 
of behavior. 


The Gambler’s Fallacy 


Please answer the following two problems: 


Problem A: Imagine that we are tossing a fair coin (a coin that has a 50/50 chance 
of coming up heads or tails) and it has just come up heads five times in a row. For 
the sixth toss, do you think that 

It is more likely that tails will come up than heads? 

It is more likely that heads will come up than tails? 


Problem B: When playing slot machines, people win something one out of every
10 times. Julie, however, has just won on her first four plays. What are her chances
of winning the next time she plays? ___ out of ___


These two problems probe whether a person is prone to the so-called gambler’s fallacy,
the tendency to see links among events that are really independent. Most games of
chance that use proper equipment have this property of independence. For example, half the
numbers on a roulette wheel are red, and half are black (for purposes of simplification,
we will ignore the green zero and double zero), so the odds are even (0.50) that any given
spin will come up red. Yet after five or six consecutive reds, many bettors switch to
black, thinking that it is now more likely to come up. This is the gambler’s fallacy.
In this case, the bettors are wrong in their belief. The roulette wheel has
no memory of what has happened previously. Even if 15 reds in a row come up, the
probability of red coming up on the next spin is still 0.50.
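
The same independence applies to the coin in Problem A, and it can be verified by brute force. The sketch below, which assumes a fair coin and independent tosses, lists every possible sequence of six tosses, keeps only those that begin with five heads, and checks how often the sixth toss is heads. The variable names are our own.

```python
from fractions import Fraction
from itertools import product

# Enumerate every equally likely sequence of six fair coin tosses.
sequences = list(product("HT", repeat=6))

# Keep only the sequences that begin with five heads in a row.
after_five_heads = [s for s in sequences if s[:5] == ("H",) * 5]

# Among those, what fraction shows heads on the sixth toss?
p_heads_sixth = Fraction(sum(s[5] == "H" for s in after_five_heads), len(after_five_heads))
print(p_heads_sixth)  # 1/2
```

Only two sequences survive the filter, HHHHHH and HHHHHT, and they are equally likely, so the run of five heads tells us nothing about the sixth toss.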

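In problem A, some people think that it is more likely that either heads or tails
will come up after five heads, and they are displaying the gambler’s fallacy by
thinking so. Heads and tails remain equally likely on the sixth toss, and in problem B
Julie’s chances of winning on her next play are still 1 out of 10.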

The gambler’s fallacy is not restricted to the inexperienced. Research has
shown that even habitual gamblers, who play games of chance over 20 hours a
week, still display belief in the gambler’s fallacy (Petry, 2005; see also Toplak et al., 2007).

It is important to realize that the gambler’s fallacy is not restricted to games
of chance. The genetic makeup of babies is an example. Psychologists,
physicians, and marriage counselors often see couples who, after having two female
children, are planning a third child because “We want a boy, and it’s bound to be a
boy this time.” This, of course, is the gambler’s fallacy. The probability of having a
boy (approximately 50 percent) is exactly the same after having two girls as it was in 
the beginning. The two previous girls make it no more likely that the third baby will 
be a boy. 

The gambler’s fallacy stems from many mistaken beliefs about probability. One
is the belief that if a process is truly random, no sequence, not even a small one (six
coin flips, for instance), should display runs or patterns. People routinely underestimate
the likelihood of runs (HHHH) and patterns (HHTTHHTTHHTT) in a random
sequence. For this reason, people cannot generate truly random sequences when they
try to do so. The sequences that they generate tend to have too few runs and patterns.
When generating such sequences, people alternate their choices too much in a mistaken
effort to destroy any structure that might appear (Fischer & Savranevski, 2015).


Those who claim to have psychic powers can easily exploit this tendency. Consider 
a demonstration sometimes conducted in college psychology classes. A student is told 
to prepare a list of 200 numbers by randomly choosing from the numbers 1, 2, and 3 
over and over again. After it is completed, the list of numbers is kept out of view of 
the instructor. The student is now told to concentrate on the first number on the list, 
and the instructor tries to guess what the number is. After the instructor guesses, the 
student tells the class and the instructor the correct choice. A record is kept of whether
the instructor’s guess matched, and the process continues until the complete record of
200 matches and nonmatches is recorded. Before the procedure begins, the instructor
announces that she or he will demonstrate “psychic powers” by reading the subject’s
mind during the experiment. The class is asked what level of performance (that is,
percentage of “hits”) would constitute empirically solid evidence of psychic powers.
Because a result of 33 percent hits could be expected purely on the basis of chance,
the instructor would
have to achieve a larger proportion than this, probably at least 40 percent, before one
should believe that she or he has psychic powers. The class usually understands and 
agrees with this argument. The demonstration is then conducted, and a result of more 
than 40 percent hits is obtained, to the surprise of many. 

The students then learn some lessons about randomness and about how easy it 
is to fake psychic powers. The instructor in this example merely takes advantage of 
the fact that people do not generate enough runs: They alternate too much when pro- 
ducing “random” numbers. In a truly random sequence of numbers, what should the 
probability of a 2 be after three consecutive 2s? One-third, the same as the probability
of a 1 or a 3. But this is not how most people generate such numbers. After even a small
run, they tend to alternate numbers in order to produce a representative sequence. 
Thus, on each trial in our example, the instructor merely picks one of the two numbers 
that the student did not pick on the previous trial. Thus, if on the previous trial the 
student generated a 2, the instructor picks a 1 or a 3 for the next trial. If on the previ- 
ous trial the subject generated a 3, the instructor picks a 1 or a 2 on the next trial. This 
simple procedure usually ensures a percentage of hits greater than 33 percent—greater 
than chance accuracy without a hint of psychic power. 
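
The instructor's trick is easy to simulate. The sketch below is only an illustration under assumptions of our own: the student repeats the previous number with some probability `repeat_prob` (one-third for a truly random student, lower for a student who avoids repeats), and the value 0.15 is a hypothetical figure chosen for the example, not an estimate from the research literature. A large number of trials is used so that the simulated hit rate settles close to its long-run value.

```python
import random


def simulate_hit_rate(repeat_prob, trials=200_000, seed=0):
    """Fraction of trials on which the instructor's guess matches the student."""
    rng = random.Random(seed)
    numbers = [1, 2, 3]
    previous = rng.choice(numbers)  # the student's first number; not scored
    hits = 0
    for _ in range(trials):
        others = [n for n in numbers if n != previous]
        # Student: repeats the previous number with probability repeat_prob,
        # otherwise picks one of the other two numbers at random.
        current = previous if rng.random() < repeat_prob else rng.choice(others)
        # Instructor: always guesses one of the two numbers not chosen last time.
        guess = rng.choice(others)
        hits += guess == current
        previous = current
    return hits / trials


print(simulate_hit_rate(repeat_prob=1 / 3))  # truly random student: close to 1/3
print(simulate_hit_rate(repeat_prob=0.15))   # student who avoids repeats: close to 0.425
```

In this model the long-run hit rate is (1 - repeat_prob) / 2, so the instructor beats chance whenever the student repeats less often than one time in three.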


The belief that a truly random sequence should display no runs or patterns
was illustrated humorously in the controversy over the
iPod’s “shuffle” feature that broke out in 2005 (Levy, 2005; Froelich et al., 2009).
This feature plays the songs loaded into the iPod in a random sequence. Of course,
knowing the research I have just discussed, many psychologists and statisticians
chuckled to themselves when the inevitable happened: users complained that the
shuffle feature could not be random because they often experienced sequences of
songs from the same album or genre (Ziegler & Garfield, 2012). Technical writer
Steven Levy (2005) described how he had experienced the same thing. His iPod
seemed always to have a fondness for Steely Dan in the first hour of play! But Levy
was smart enough to accept what the experts told him: truly random sequences often
contain runs, and so they often do not look random to us.

These, then, are just a few of the shortcomings in statistical reasoning that obscure an
understanding of psychology. More complete and detailed coverage is provided in 
Kahneman’s Thinking, Fast and Slow (2011). Introductions to many of these ideas (and 
good places to start for those who lack extensive statistical training) are contained in 
Hastie and Dawes’s Rational Choice in an Uncertain World (2010), Baron’s Thinking and 
Deciding (2008), my own Decision Making and Rationality in the Modern World (2010), 
Charles Wheelan’s Naked Statistics: Stripping the Dread from Data (2013), and Jordan 
Ellenberg’s How Not to Be Wrong: The Power of Mathematical Thinking (2014). 

The probabilistic thinking skills discussed in this chapter are of tremendous 
practical significance. Because of inadequately developed probabilistic thinking abil- 
ities, physicians choose less effective medical treatments (Croskerry, 2013); people 
fail to assess accurately the risks in their environment (Fischhoff & Kadvany, 2011); 
information is misused in legal proceedings (Gigerenzer et al., 2007); unnecessary
surgery is performed (Gigerenzer et al., 2007; Groopman, 2007); and costly financial
misjudgments are made (Lewis, 2017; Thaler, 2015; Zweig, 2008).

Of course, a comprehensive discussion of statistical reasoning cannot be carried 
out in a single chapter. Our goal was much more modest: to highlight the importance
of statistical thinking for understanding psychology. Unfortunately, there is no
simple rule to follow when confronted with statistical information. Unlike other
components of scientific thinking that are more easily acquired, statistical reasoning
appears to be difficult to pick up without formal study (Evans, 2015). A past president
of the Association for Psychological
Science, Morton Ann Gernsbacher (2007), derived a list of 10 things of intellectual
value that she thinks psychological training specifically instills, and 4 of her 10 were
in the domains of statistics and methodology.

Ludy Benjamin, winner of a prestigious APA teaching award, discussed the 
most important features that he says should be in an introductory psychology class. 
While acknowledging that of course such a class must present the most important 


findings in the discipline, Benjamin went on to say that he thought that “in the
long run . . .” (Dingfelder, 2007, p. 26).

This legacy is much valued in the real world outside of the academic psychology 
department. Money Magazine listed the 21 most valuable career skills in their survey 
of business and industry (Weisser et al., 2016) and the list was chock-full of statistics 
and data-analysis skills (data mining, forecasting, facility in statistical software, data 
modeling, etc.). And if you are a psychology major reading this, you should note 
that there are more and more CEOs out there like Boaz Salik (2016). In interviews, he 
asks job applicants to his consulting firm how to calculate the joint probability of two
events! The question should be a piece of cake for any psychology major but might
trip up anyone not comfortable with statistics and probability.


Our current world is awash in statistics and graphic displays of numbers. In medicine,
finance, advertisements, and on the news, we are presented with claims based on
statistics (Silver, 2012). We need to learn to evaluate them, and fortunately the
statistical training provided in psychology helps us do so. Clearly,
one of the goals of this book is to make research in the discipline of psychology more
accessible. However, a full understanding of that research (as is the case in
many other fields, such as economics, sociology, and genetics) requires some knowledge
of statistics. Thus, although this chapter has served as an extremely brief lesson in
statistical thinking, its main purpose has been to highlight the existence of an area of
expertise that is critical to a full understanding of psychology.

Summary
As in most science,


just as is the case
in other sciences). One thing that prevents the understanding
of much psychological research is that many people
have difficulty thinking in probabilistic terms. In this chapter,
we discussed several well-researched examples of how
probabilistic reasoning goes astray for many people: They
make insufficient use of probabilistic information when
they also have vivid testimonial evidence available; they
fail to take into account the fact that larger samples give
more accurate estimates of population values; and, finally,
they display the gambler’s fallacy (the tendency to see links
among events that are really independent). The gambler’s
fallacy derives from a more general tendency that we will
discuss in the next chapter: the tendency to fail to recognize
the role of chance in determining outcomes.


Chapter 11 


The Role of Chance in Psychology


Learning Objectives 


11.1 Explain why chance impedes interpretation of scientific evidence 
11.2 Explain how people misunderstand the meaning of coincidence 


11.3 Differentiate between actuarial and clinical predictions 


In the last chapter, we discussed the importance of probabilistic trends, probabilistic 
thinking, and statistical reasoning. In this chapter, we will continue that discussion 
with an emphasis on the difficulties of understanding the concepts of randomness and 
chance. We will emphasize how people often misunderstand the contribution of research
to clinical practice because of a failure to appreciate how thoroughly the concept
of chance is integrated within psychological theory. 


The Tendency to Try to Explain 
Chance Events 


This strong tendency to search for structure has been studied by psychologists.


Nevertheless,


The quest for conceptual understanding is maladaptive when it takes place
in an environment in which there is nothing to conceptualize. What plays havoc with 


one of the most distinguishing features of human cognition? What confounds our 
quest for structure and obscures understanding? You guessed it: probability. Or, more 
specifically, chance and randomness. 


Again, recall a previous example: Smoking causes lung cancer. A systematic, explainable
aspect of biology links smoking to this particular disease. But not all smokers
contract lung cancer. The trend is probabilistic. Perhaps we will eventually be able to 


explain why some smokers do not contract cancer. However, for the time being, this 
variability must be ascribed to the multitude of chance factors that determine whether 
a person will contract a particular disease.

As this example illustrates,


A coin toss is a chance event, but not because it is in principle impossible to determine 
the outcome by measuring the angle of the toss, the precise composition of the coin, 
and many other variables. In fact, the outcome of a toss is determined by all these variables.


Often, however,


Psychologists have conducted experiments on this phenomenon. In one experimental
situation, subjects view a series of stimuli that vary in many different dimensions.
The subjects are told that some stimuli belong to one class and other stimuli belong to 
another. Their task is to guess which class each of a succession of stimuli belongs to. 
However, the researcher actually assigns the stimuli to classes randomly. Thus, there 
is no rule except randomness. The subjects, however, rarely venture randomness as 


a guess. Instead, they tend to concoct extremely elaborate and complicated theories to
explain how the stimuli are assigned.


The thinking of many financial analysts illustrates how difficult it is to acknowl- 
edge the large effect of randomness in certain domains. It is common for financial ana- 
lysts to concoct elaborate explanations for every little fluctuation in stock market prices. 
In fact, much of this variability is simply random fluctuation (Ellis, 2016; Kahneman, 
2011). What we should be hearing many nights on television is something like “The 
Dow Jones average gained 27 points today because of random fluctuation in a complex 
interacting system.” You will never hear this headline, because financial analysts want 
to imply that they can explain everything—every little burp in market behavior. They 
continue to imply to their customers (and perhaps themselves believe) that they can 
“beat the market” when there is voluminous evidence that the vast majority of them 
can do no such thing. Throughout most of the last several decades, if you had bought 
all of the 500 stocks in the Standard and Poor’s Index and simply held them (what we 
might call a no-brain strategy—a strategy you could actually carry out by buying a 
mutual fund that tracks that index), then you would have had higher returns than over 
three quarters of the money managers on Wall Street (Bogle, 2015; Ellis, 2016; Investor’s 
Guide, 2017; Malkiel, 2016). You would also have beaten 80 percent of the financial 
newsletters that subscribers buy at rates of up to $1,000 per year. 

But what about the managers who do beat the no-brain strategy? You might be
wondering whether this means that they have some special skill. We can answer that
question by considering the following thought experiment. One hundred monkeys 
have each been given ten darts, and they are each going to throw them at a wall con- 
taining the names of each of the Standard and Poor’s 500 stocks. Where the darts land 
will define that monkey’s stock picks for the year. How will they do a year later? How 
many will beat the Standard and Poor’s 500 Index? You guessed it. Roughly half of the 
monkeys. Would you be interested in paying the 50 percent of the monkeys who beat 
the index a commission to make your picks for you next year?
The logic by which purely random sequences seem to be the result of predict- 
able factors is illustrated by a continuation of this example of financial predictions. 
Imagine that a letter comes in the mail informing you of the existence of a stock-market- 
prediction newsletter. The newsletter does not ask for money but simply tells you to 
test it out. It tells you that IBM stock is going to go up during the next month. You put 
the letter away, but you do notice IBM stock does go up the next month. Having read 
a book like this one, however, you know better than to make anything of this result. 
You chalk it up to a lucky guess. Subsequently, you receive another newsletter from the 


same investment-advice company telling you that IBM stock will go down the follow- 
ing month. When the stock does go down, you again chalk the prediction up to a lucky 
guess, but you do get a bit curious. When the third letter from the same company comes 
and predicts that IBM will go down again the next month, you do find yourself watch- 
ing the financial pages a little more closely, and you confirm for the third time that the 
newsletter’s prediction was correct. IBM has gone down this month. When the fourth 
newsletter arrives from the same company and tells you that the stock will rise the 
next month, and it actually does move in the predicted direction for the fourth time, it 
becomes difficult to escape the feeling that this newsletter is for real—difficult to escape 
the feeling that maybe you should send in the $29.95 for a year’s worth of the news- 
letter. Difficult to escape the feeling, that is, unless you can imagine the cheap base- 
ment office in which someone is preparing next week’s batch of 1,600 newsletters to be 
sent to 1,600 addresses: 800 of the newsletters predict that IBM will go up during the 
next month, and 800 of the newsletters predict that IBM will go down during the next 
month. When IBM does go up, that office sends out letters to only the 800 addressees 
who got the correct prediction the month before (400 predicting that the stock will go 
up in the next month and 400 predicting that it will go down, of course). Then you can 
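imagine the “boiler room” (probably with telemarketing scams purring on the phones
in the background) sending the third month’s predictions to only the 400 addressees
who got the correct prediction the second month (200 predicting that the stock will go up in the next
month and 200 predicting that it will go down). Yes, you were one of the lucky 100 who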
received four correct random predictions in a row! Many of these lucky 100 (and prob- 
ably very impressed) individuals will pay the $29.95 to keep the newsletters coming. 
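
The arithmetic of the newsletter scheme can be sketched in a few lines of Python. This is only an illustration of the halving logic described above; the function name `surviving_recipients` is a hypothetical helper introduced here, and the sketch assumes that each mailing is split exactly in half between "up" and "down" predictions.

```python
def surviving_recipients(initial_mailing, rounds):
    """Return how many addressees have seen only correct predictions after each round."""
    survivors = []
    remaining = initial_mailing
    for _ in range(rounds):
        # Half the letters say "up" and half say "down", so whichever way
        # the stock moves, exactly half of the current recipients saw a hit.
        remaining //= 2
        survivors.append(remaining)
    return survivors


print(surviving_recipients(1600, 4))  # [800, 400, 200, 100]
```

No forecasting skill appears anywhere in the code. The 100 people who end up with four correct predictions are produced by the mailing plan alone.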

Now this seems like a horrible scam to play on people. And indeed it is. But it is 
no less of a scam than when “respectable” financial magazines and TV shows pres- 
ent to you the “money manager who has beaten more than half his peers four years 
in a row!” Again, think back to our monkeys throwing the darts. Imagine that they 
were money managers making stock picks year after year. By definition, 50 percent of 
them will beat their peers during the first year. Half of these will again—by chance— 
beat their peers in the second year, making a total of 25 percent who beat their peers 
two years in a row. Half of these will again—by chance—beat their peers in the third 
year, making a total of 12.5 percent who beat their peers three years in a row. And 
finally, half of these 12.5 percent (i.e., 6.25 percent) will again beat their peers in the 
fourth year. Thus, about 6 of the 100 monkeys will have, as the financial shows and 
newspapers say, “consistently beaten other money managers for four years in a row!” 
These six monkeys who beat their dartboard peers (and, as we just saw, would beat 
a majority of actual Wall Street money managers; Ellis, 2016; Malkiel, 2016) certainly 
deserve spots on the financial television programs, don’t you think? 
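
The four-year streak calculation can be checked with a short sketch. It assumes, as the thought experiment does, that each picker independently has a 50 percent chance of beating his or her peers in any given year. The function names and the choice of 2,000 simulated repetitions are illustrative assumptions, not part of the original example.

```python
import random


def expected_streakers(n_pickers, years, p_beat=0.5):
    """Expected number of pickers who beat their peers every single year by chance."""
    return n_pickers * p_beat ** years


def simulate_streakers(n_pickers, years, rng):
    """Count pickers who win a fair coin flip in every one of `years` years."""
    count = 0
    for _ in range(n_pickers):
        if all(rng.random() < 0.5 for _ in range(years)):
            count += 1
    return count


rng = random.Random(0)  # fixed seed so the sketch is repeatable
runs = [simulate_streakers(100, 4, rng) for _ in range(2000)]

print(expected_streakers(100, 4))   # 6.25
print(sum(runs) / len(runs))        # simulated average, close to 6.25
```

The simulated average hovers near the exact value of 6.25, which is where "about 6 of the 100 monkeys" comes from.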
Explaining Chance: Illusory Correlation
and the Illusion of Control 


In short,


(Kahneman, 2011; Whitson & Galinsky, 2008).


Controlled studies have demonstrated that

Unfortunately, this finding generalizes to some
real-world situations that adversely affect people’s lives. For example,

This is
the famous inkblot test in which the subject responds to blotches on a white paper.


The problem with all of this is that


(Lilienfeld et al., 2010, 2012).

Many of the interpersonal encounters in our lives have a large amount of chance
in them: the blind date that leads to marriage, the canceled appointment that causes
the loss of a job, the missed bus that leads to a meeting with an old high school friend.

Psychologists have studied what has been

widespread nature of this fallacy comes from the experience of states in which lotteries
have been instituted. These states are descended on by purveyors of bogus books 
advising people how to “beat” the lottery—books that sell because people do not 
understand the implications of randomness. In fact, the explosion in the popularity 
of state lotteries in the United States did not occur until the mid-1970s, when New 
Jersey introduced participatory games in which players could scratch cards or pick 
their own numbers. 


Chance and Psychology 


In psychology, the tendency to try to explain everything, to have our theories account 


rather than just the systematic
personal theories and those that are ostensibly scientific.


The television talk-show
guest (Chapter 4) who has an answer for every single case, for every bit of human


Coincidence


The tendency to seek explanations for essentially chance occurrences leads to much 
misunderstanding regarding the nature of coincidental events. 


Most dictionary definitions of the word “coincidence” interpret it to refer to an 
accidental remarkable occurrence of related events. Because the same dictionaries 
define accidental as “occurring by chance,” there is no problem here. 


Instead, they seek elaborate theories
in order to understand these events,


If people truly understood what coincidence meant (a remarkable occurrence that 
is due to chance), they would not fall prey to the fallacy of trying to develop systematic,
nonchance explanations for these chance events. Yet,


For
example, most of us have heard statements like “My goodness, what a coincidence! I
wonder why that happened!” This reflects a fundamental error: coincidences do not
need an explanation.


Psychologist David Marks (2001)

One thing that


contributes to the tendency to search for explanations of coincidental events is the mis- 
taken idea that rare events never happen, that oddmatches are never due to chance. 
Our belief in this fallacy is intensified because probabilities are sometimes stated in 
terms of odds and because of the connotations that such statements have. Think of 
how we phrase the following: “Oh, goodness, that’s very unlikely. The odds are 100 
to 1 against that happening!” The manner in which we articulate such a statement 
strongly implies that it will never happen. Of course, we could say the same thing in a 
very different way, one that has very different connotations: “In 100 events of this type, 
this outcome will probably happen once.” This alternative phrasing emphasizes that, 
although the event is rare, in the long run rare events do happen. 


If you flipped 5 coins all at once and they all came up heads, you would probably consider
this result an oddmatch, an unlikely event. You would be right. The probability
of this happening in any one flip of 5 coins is 1/32 or 0.03. But if you flipped the 5 coins
100 times and asked how likely it is that in at least 1 of those 100 trials the coins would
all come up heads, the answer would be 0.96. That is, in 100 trials, this rare event, this 
oddmatch, is very likely to happen. 
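
The two figures in the coin example follow from one short formula: if an event has probability p on a single trial, the chance that it happens at least once in n independent trials is 1 - (1 - p)^n. A minimal sketch, with `prob_at_least_once` as a hypothetical helper name:

```python
def prob_at_least_once(p_single, trials):
    """Probability that an event with per-trial probability p_single occurs
    at least once in `trials` independent trials."""
    return 1.0 - (1.0 - p_single) ** trials


p_all_heads = 0.5 ** 5  # five fair coins all landing heads: 1/32

print(round(p_all_heads, 2))                           # 0.03
print(round(prob_at_least_once(p_all_heads, 100), 2))  # 0.96
```

Trying other values of `trials` shows how quickly a "rare" event becomes a near certainty as opportunities accumulate.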

In short,

In August 1913, in a casino in Monte Carlo (Kaplan & Kaplan, 2007),
black came up on a roulette wheel 26 times in a row! Or, take another example: If lotteries
go on long enough, consecutive identical winning numbers are bound to be drawn
eventually. For example, on June 21, 1995, in a German lottery called 6/49 (6 numbers 
are picked out of 49 possible) the numbers drawn were 15-25-27-30-42-48—exactly the 
same set of numbers that had been drawn on December 20, 1986 (Mlodinow, 2008). 
Many people were surprised to learn that over that time period the chance that some 
set of numbers would repeat was as high as 28 percent. 

There are websites devoted to the “spooky” fact that many famous musicians died 
at age 27: Amy Winehouse, Kurt Cobain, Jim Morrison, Jimi Hendrix, Janis Joplin, and 
so on (O’Connor, 2011). Except that there is nothing “spooky” about it. It is not a fact 
in need of explanation. It is, instead, a random occurrence. The reason we know this is
because of a statistical analysis published in the British Medical Journal of 1,046 musicians
who had a No. 1 album on the British charts from 1956 to 2007 (Barnett, 2011).
The analysis indicated that there is no tendency for star musicians to die dispropor- 
tionately at age 27. 


Cognitive
psychologist Daniel Kahneman (2011) describes how during the Yom Kippur War in 


1973 he was approached by the Israeli Air Force for advice. Two squads of aircraft 
had gone out and one squad had lost four aircraft and the other had lost none. The 
Air Force wanted Kahneman to investigate whether there were factors specific to the 
different squadrons that were correlated with the outcome. But Kahneman knew that, 
with a sample this small, any such factors found would most likely be spurious— 
the result of mere chance fluctuation. Instead of doing a study, Kahneman used the 
insights in this chapter and told the Israeli Air Force not to waste their time. He says, 
“I reasoned that luck was the most likely answer, that a random search for a nonobvi- 
ous cause was hopeless, and that in the meantime the pilots in the squadron that had 
sustained losses did not need the extra burden of being made to feel that they and their 
dead friends were at fault” (p. 116). 


Personal Coincidences 


Oddmatches that happen in our personal lives often have special meaning to us and, 
thus, we are especially prone not to attribute them to chance. There are many reasons 
for this tendency. Some are motivational and emotional, but others are due to failures 
of probabilistic reasoning. We often do not recognize that oddmatches are actually just 
a small part of a much larger pool of “nonoddmatches.” It may seem to some of us that 
oddmatches occur with great frequency. But do they? 

Consider what an analysis of the oddmatches in your personal life would reveal. 
Suppose on a given day you were involved in 100 distinct events. This does not seem an
overestimate, considering the complexity of life in a modern industrial society. Indeed,
it is probably a gross underestimate. You watch television, talk on the telephone, meet
people, negotiate the route to work or to the store, do household chores, take in information
while reading, send and receive emails and texts, complete complex tasks at
work, and so on. These events contain several components that are separately memorable.
One hundred, then, is probably on the low side, but we will stick with it. An
oddmatch is a remarkable conjunction of two events. How many possible different
pairs of events are there in the 100 events of your typical day? Using a simple formula
to obtain the number of combinations, we calculate that there are 4,950 different pairings
of events possible in your typical day. This is true 365 days a year.

Now, oddmatches are very memorable. You would probably remember for several 
years the day Uncle Bill called. Assume that you can remember all the oddmatches 
that happened to you in a ten-year period. Perhaps, then, you remember six or seven 
oddmatches (more or less, people differ in their criteria for oddness). What is the pool 
of nonoddmatches from which these six or seven oddmatches came? It is 4,950 pairs 
per day multiplied by 365 days per year multiplied by 10 years, or 18,067,500. In short, 
6 oddmatches happened to you in 10 years, but 18,067,494 things that could have been 
oddmatches also happened. The probability of an oddmatch happening in your life is 
0.00000033. It hardly seems strange that 6 out of 18 million conjunctions of events in 
your life should be odd. Odd things do happen. They are rare, but they do happen. 
Chance guarantees it (recall the example of simultaneously flipping five coins). In our 
example, six odd things happened to you. They were probably coincidences: remark- 
able occurrences of related events that were due to chance. Daniel Kahneman (2011) 
has argued that our language fails us here. We have terms for past thoughts that turned 
out to be true (premonition, intuition), but we have no words to mark and bring to our 


attention past beliefs that turned out to be false. Most people would not spontaneously 
think to say, “I had a premonition that the marriage would not last, but I was wrong” 
(p. 202), because somehow that would seem strange to them. Without a word to mark 
the occurrence, we are not prone to mark all of our past predictions that failed to occur. 
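
The oddmatch arithmetic above can be reproduced directly. The "simple formula" for the number of pairs is the combination count "100 choose 2," which Python exposes as `math.comb`. The variable names below are illustrative, and the inputs (100 events per day, 6 remembered oddmatches in 10 years) are the assumptions of the example itself.

```python
import math

events_per_day = 100
pairs_per_day = math.comb(events_per_day, 2)  # number of distinct pairs of events

days = 365 * 10                               # ten years, ignoring leap days
pool = pairs_per_day * days                   # every pairing that could have been odd

remembered_oddmatches = 6
rate = remembered_oddmatches / pool

print(pairs_per_day)                  # 4950
print(pool)                           # 18067500
print(pool - remembered_oddmatches)   # 18067494
print(f"{rate:.8f}")                  # 0.00000033
```

Raising `events_per_day` to a more realistic figure makes the pool of nonoddmatches grow roughly with the square of that number, which only strengthens the point.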


The famous
“birthday problem” provides a good example of this. In a class of 23 people, what is 


the probability that 2 of them will have their birthday on the same day? What is the 
probability in a class of 35 people? Most people think that the odds are pretty low. 
Actually, in the class with 23 people, the odds are better than 50-50 that 2 people will 
have birthdays on the same day. And in the class of 35 students, the odds are very high 
(the probability is over 0.80). Thus, because there have been 45 presidents of the United 
States, it is not surprising that 2 (James Polk and Warren Harding) were born on the 
same day (November 2). Nor is it surprising, because 39 presidents have died, that 2 
(Millard Fillmore and William Howard Taft) have died on the same day (March 8) and, 
furthermore, that three more (John Adams, Thomas Jefferson, James Monroe) have all 
also died on the same day (July 4!!). 
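
The birthday probabilities can be verified with a brief calculation. The sketch below uses the standard textbook simplification, which is an assumption here: 365 equally likely birthdays, no leap years, and independence between people. It computes the chance that everyone's birthday is different and subtracts it from 1. The function name `prob_shared_birthday` is hypothetical.

```python
def prob_shared_birthday(n_people, days=365):
    """Probability that at least two of n_people share a birthday,
    assuming all `days` birthdays are equally likely."""
    p_all_distinct = 1.0
    for k in range(n_people):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


print(round(prob_shared_birthday(23), 2))  # 0.51
print(round(prob_shared_birthday(35), 2))  # 0.81
```

Calling the same function with 45 or 39 gives the corresponding probabilities for the groups of presidents mentioned above under the same simplifying assumptions.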


Accepting Error in Order to Reduce 
Error: Clinical Versus Actuarial 
Prediction 


The reluctance to acknowledge the role of chance when trying to explain out- 
comes in the world can actually decrease our ability to predict real-world events. 


But interestingly,


It may seem paradoxical, but it is true that


The concept that we must accept error in order to reduce error is illustrated by a 
very simple experimental task that has been studied for decades in cognitive psychol- 
ogy laboratories. The subject sits in front of two lights (one red and one blue) and is 
told that she or he is to predict which of the lights will be flashed on each trial and 
that there will be several dozen such trials (subjects are often paid money for correct
predictions). The experimenter has actually programmed the lights to flash randomly,
with the provision that the red light will flash 70 percent of the time and the blue light 
30 percent of the time. Subjects do quickly pick up the fact that the red light is flash- 
ing more, and they predict that it will flash on more trials than they predict that the 
blue light will flash. In fact, they predict that the red light will flash approximately 
70 percent of the time. However, as discussed earlier in this chapter, subjects come 
to believe that there is a pattern in the light flashes and almost never think that the 
sequence is random. Instead, they switch back and forth from red to blue, predicting 
the red light roughly 70 percent of the time and the blue light roughly 30 percent of 
the time. Subjects rarely realize that—despite the fact that the blue light is coming on 
30 percent of the time—if they stopped switching back and forth and predicted the red 
light every time, they would actually do better! How can this be? 

Let’s consider the logic of the situation. How many predictions will subjects get 
correct if they predict the red light roughly 70 percent of the time and the blue light 
roughly 30 percent of the time and the lights are really coming on randomly in a ratio of
70 to 30? We will do the calculation on 100 trials in the middle of the experiment, after
the subject has noticed that the red light comes on more often and is, thus, predicting
the red light roughly 70 percent of the time. In 70 of the 100 trials, the red light will 
come on and the subject will be correct on about 70 percent of those 70 trials (because 
the subject predicts the red light 70 percent of the time). That is, in 49 of the 70 tri- 
als (70 times 0.70), the subject will correctly predict that the red light will come on. 
In 30 of the 100 trials, the blue light will come on, and the subject will be correct in 
30 percent of those 30 trials (because the subject predicts the blue light 30 percent of 
the time). That is, in 9 of the 30 trials (30 times 0.30), the subject will correctly predict 
that the blue light will come on. Thus, in 100 trials, the subject is correct 58 percent of 
the time (49 correct predictions on red light trials and 9 correct predictions on blue 
light trials). But notice that this is a poorer performance than could be achieved if the 
subject simply noticed which light was coming on more often and then predicted it in 
every trial—in this case, noticing that the red light came on more often and predicting 
it in every trial (let’s call this the 100 percent red strategy). Of the 100 trials, 70 would 
be red flashes, and the subject would have predicted all 70 of these correctly. Of the 
30 blue flashes, the subject would have predicted none correctly but still would have a 
prediction accuracy of 70 percent, which is 12 percentage points better than the 58 percent correct that the subject
achieved by switching back and forth. 
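
The comparison between the two strategies can be written out as a short calculation. The sketch assumes that the subject's guesses are independent of the actual flashes, which is what random programming of the lights implies. The function names are illustrative, not terms from the research literature.

```python
def matching_accuracy(p_red):
    """Expected accuracy when the subject predicts red with probability p_red
    and the light is red with the same probability, independently."""
    p_blue = 1.0 - p_red
    # Correct when guess and flash agree: red-red plus blue-blue.
    return p_red * p_red + p_blue * p_blue


def always_majority_accuracy(p_red):
    """Expected accuracy when the subject predicts the more frequent light every time."""
    return max(p_red, 1.0 - p_red)


p_red = 0.70
print(round(matching_accuracy(p_red), 2))         # 0.58
print(round(always_majority_accuracy(p_red), 2))  # 0.7
```

For any `p_red` other than exactly 0.5, always predicting the majority light does better than matching the frequencies, even though it guarantees an error on every minority trial.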


The optimal strategy does have an implication, though, that troubles some people:
the strategy will be wrong every time a blue light occurs. And since blue-light
stimuli are occurring on at least some of the trials, to some people it just does not seem
right never to predict them.


Accepting error in order to make fewer errors is a difficult thing to do, however, 
as evidenced by the 60-year history of research on clinical versus actuarial prediction


predictions) that we discussed at the beginning of this chapter.


More accurate predictions can be
made if we take more than one group characteristic into account (using the complex
correlational techniques mentioned in Chapter 5—specifically a technique known 
as multiple regression). For example, predicting a life span of 60.2 years for people 
who smoke, are overweight, and do not exercise would be an example of an actuarial 
prediction based on a set of variables (smoking behavior, weight, and amount of exer- 
cise), and such predictions are almost always more accurate than predictions made 
from a single variable. 


For example, in studies published
in the Journal of the American Medical Association and in the Annals of Internal


Medicine the following probabilistic trends were reported: people who are obese in 
middle age are four times more likely than nonobese people to have heart problems 
after age 65; overweight (but not obese) people are twice as likely to develop kidney 
problems; and obese people are seven times more likely to develop kidney problems 
(Seppa, 2006). But probabilistic prediction admits error. Not all obese people will have 
health problems. Recall the case (from Chapter 10) of the political broadcaster Tim 


Russert who died of a heart attack at age 58. Physicians determined that Mr. Russert’s 
probability of a heart attack in the next ten years was only 5 percent. That is, most 
people (95 out of 100) with Mr. Russert’s profile would be heart-attack free for ten 
years. Mr. Russert was one of the unlucky 5 percent—he was an exception to the 
general trend. 

People sometimes find it difficult to act on actuarial evidence, however, because 
doing so often takes mental discipline. For example, in 2003 the Food and Drug 
Administration issued a health-advisory warning of a potential link between a popu- 
lar antidepressant drug and teen suicide. Many physicians worried that, on an actu- 
arial basis, the warning would result in more suicides. The physicians acknowledged 
that fewer teenagers would die of suicide because of the drug, but they warned that 
even more children would die because of an increased hesitancy to prescribe the drug. 
This is indeed what happened. Treatment with this drug can put children at a tempo- 
rary risk, but untreated depression is far worse. Most doctors thought that the warning 
would cost more lives than it would save (Dokoupil, 2007). That was the mathematics 
of the situation. Or perhaps we should say: That’s the calculus of actuarial prediction. 
But it can be a hard calculus to follow when folk wisdom is saying things like “better 
to be safe than sorry.” But in the domain of medical treatment “better to be safe than
sorry”

were unavailable.

called clinical, or case, prediction. When engaged in clinical prediction, as opposed to


actuarial prediction, professional psychologists claim to be able to make predictions 
about particular individuals that transcend predictions about “people in general” or 
about various categories of people. Clinical prediction would seem to be a very useful 
addition to actuarial prediction. There is just one problem, however. Clinical predic- 
tion doesn’t work. 


For clinical prediction to be useful, the clinician’s experience with the client and 
her or his use of information about the client would have to result in better predic- 
tions than we can get from simply coding information about the client and submitting 


it to statistical procedures. In short,

Research on the issue of clinical versus actuarial prediction has been consistent,
and it has been going on for a long time. Since the publication in 1954 of Paul Meehl’s 
classic book Clinical Versus Statistical Prediction, decades of research consisting of over 
a hundred research studies have shown that, in just about every clinical prediction 
domain that has ever been examined (psychotherapy outcome, parole behavior, col- 
lege graduation rates, response to electroshock therapy, criminal recidivism, length of 
psychiatric hospitalization, and many more), 

(Kahneman, 2011; Lewis, 2017; Morera & Dawes, 2006; 
Tetlock & Gardner, 2015). It is for this reason that some states in the United States have 
begun to replace subjective parole boards with actuarial methods when making prison 
release decisions (Walker, 2013). 

In a variety of clinical domains, when a clinician is given information about a 
client and asked to predict the client’s behavior, and when the same information is 
quantified and processed by a statistical equation that has been developed based on
actuarial relationships that research has uncovered, the actuarial prediction is more
accurate than the clinician’s prediction. In fact, even when the clinician has more infor- 
mation available than is used in the actuarial method, the latter is still superior. That 
is, when the clinician has information from personal contact and interviews with the 
client, in addition to the same information that goes into the actuarial equation, the 
clinical predictions still do not achieve an accuracy as great as the actuarial method. 
Here we have an example of failing to “accept error in order to reduce error” 
that is directly analogous to the light prediction experiment previously described. 
Rather than relying on the actuarial information that the red light came on more 
often and predicting red each time (and getting 70 percent correct), the subjects tried 
to be correct on each trial by alternating red and blue predictions and ended up 
being 12 percent less accurate (they were correct on only 58 percent of the trials). 
Analogously, the clinicians in these studies believed that their experience gave them 
“clinical insight” and allowed them to make better predictions than those that can be 
made from quantified information in the client’s file. In fact, their “insight” is non- 
existent and leads them to make predictions that are worse than those they would 
make if they relied only on the public, actuarial information. It should be noted, 
though, that the superiority of actuarial prediction is not confined to psychology

(pp. 373-374).


For example,


(Dana et al., 2013). Instead,


One anti-actuarial argument that is often raised is the old cliché that group statistics
do not apply to single individuals, or to single events. But this is a vague and
imprecise statement. Does the person making this argument think that if one is forced
to play Russian roulette a single time and is allowed to select a gun with one or five
bullets in the chamber, one might as well pick the five rather than the one? It’s a
single, unique event, so it doesn’t matter, right?

the field as a whole would have little to lose, individual practitioners who engage in


activities in the role of “experts” (i.e., in courtroom testimony) and imply that they 
have unique clinical knowledge of individual cases would, of course, lose prestige and 
perhaps income. 

In fact, the field, and society, would benefit if we developed the habit of “accept- 
ing error in order to reduce error.” In attempting to find unique explanations of every 
single unusual case (unique explanations that simply may not be possible given the 
present state of our knowledge), we often lose predictive accuracy in the more mun- 
dane cases. Recall the red-blue light experiment again. The “100 percent red strat- 
egy” makes incorrect predictions of all of the minority or unusual events (when the 
blue lights flash). What if we focused more on those minority events by adopting the 
“70-percent-red-30-percent-blue strategy”? We would now be able to predict 9 of those 
30 unusual events (30 times 0.30). But the cost is that we lose our ability to predict 
21 of the majority events. Instead of 70 correct predictions of red, we now have only 
49 correct predictions (70 times 0.70). Predictions of behavior in the clinical domain 
have the same logic. In concocting complicated explanations for every case, we may 
indeed catch a few more unusual cases—but at the cost of losing predictive accuracy in 
the majority of cases, where simple actuarial prediction would work better. 
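
The arithmetic of the two strategies can be checked with a short calculation. What follows is only a minimal sketch in Python, assuming 100 trials, a red light that comes on 70 percent of the time, and guesses that are made independently of the lights. The function name `expected_correct` and its parameters are illustrative choices, not part of the original experiment.

```python
def expected_correct(p_red: float, p_guess_red: float, trials: int = 100) -> float:
    """Expected number of correct predictions when red occurs with
    probability p_red and the predictor independently guesses red
    with probability p_guess_red (both values are assumptions)."""
    hits_on_red = p_red * p_guess_red               # red occurs and red was guessed
    hits_on_blue = (1 - p_red) * (1 - p_guess_red)  # blue occurs and blue was guessed
    return trials * (hits_on_red + hits_on_blue)


# "100 percent red strategy": accept the 30 errors on blue trials
maximizing = expected_correct(0.70, 1.00)

# "70-percent-red-30-percent-blue strategy": try to catch the blue trials too
matching = expected_correct(0.70, 0.70)

print(round(maximizing), round(matching))
```

Under these assumptions the first strategy yields about 70 correct predictions and the second about 58 (49 on red plus 9 on blue), which is the 12-point loss described in the text.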

Compulsive gamblers have a strong tendency not to “accept error in order to 
reduce error.” For example, blackjack players had a tendency to reject a strategy called 
basic that is guaranteed to decrease the casino’s advantage from 6 or 8 percent to less
than 1 percent. Basic is a long-term statistical strategy, and the compulsive players 
tend to reject it because they believe that the best strategy should work every time and 
be keyed to the specifics of the situation. Instead of using an actuarial strategy that was 
guaranteed to save them thousands of dollars, compulsive gamblers are on a futile 
chase to find a way to make a clinical prediction based on the idiosyncrasies of each 
specific situation. 

Another domain in which actuarial prediction often beats clinical prediction is 
sports. Many people saw the movie Moneyball in 2011, based on the book by Michael 
Lewis (2004). It told the story of Oakland A’s general manager Billy Beane, who overruled
the “clinical” judgments of his baseball scouts (who tended to rely heavily on visible 
physical characteristics) and relied on statistics of past performance when evaluating 
potential team members. His teams overperformed relative to the money they spent, 
and the actuarial methods that he had borrowed from baseball statisticians were then 
copied by many other teams. Statistical methods have been shown to be superior to 
“coaches’ judgments” in many other sports (see Moskowitz & Wertheim, 2011, for 
many examples). 


Of course,

information is highly useful in drawing attention to variables that are important and
that need to be measured.

We will end this chapter with the answer that psychologist Nicholas Epley (2013)
gave to the interesting interview query: What’s the question about your field that you 
dread being asked? Epley picked the classic question that psychologists often get asked 
in casual conversation, “Are you analyzing me?” The question reflects the Freud problem
discussed in Chapter 1. But Epley went on to explain that it was another aspect of the
question that bothered him more, what he deemed a “deeper issue,” and I agree. Epley
claimed that the question “implies that I, as a psychologist, could indeed analyze you.
... In medicine, for instance, doctors prescribe drugs
because the average outcome of those in the treatment group of a drug trial was better
than the outcome of those in the placebo group. ... But as a psychologist, I often field
questions that call on me to offer more individualized answers than our science can warrant.”


Summary 


The role of chance in psychology is often misunderstood
by the lay public and by clinical practitioners alike.

mistakenly imply that clinical training confers an “intuitive”
ability to predict an individual case. Instead,


Learning Objectives


12.1 Summarize the reasons why psychology suffers from a negative image 


12.2 Explain why the interdisciplinary nature of psychology diminishes 
its scientific contributions 


12.3 Outline the problems within the field of psychology that
contribute to its negative image


12.4 Describe the issues that affect the field of psychology due to a 
growing ideological monoculture 


12.5 Differentiate individual psychology from scientific psychology 


12.6 Distinguish between scientific psychological research and pseudo- 
scientific claims 


12.7 Summarize how psychology uses all of the components of science 
to understand the nature of human behavior 


Rodney Dangerfield was a popular comedian for over three decades whose
trademark was the plaintive cry, “I don’t get no respect!” In a way, this is a fitting
summary of psychology’s public image. This chapter will touch on some of the reasons
that psychology appears to be the Rodney Dangerfield of the sciences.

judgments about the field and its accomplishments are resoundingly negative. Psychologists
are aware of this image problem, but most feel that there is little they can
do about it, so they simply ignore it. This is a mistake. Ignoring psychology’s image 
problem threatens to make it worse. 


Psychology’s Image Problem 


Some of the reasons for psychology’s image problem have already been discussed. For 
example, 
(Overskeid, 2007).


(Gaynor, 2004),


Psychology and Parapsychology


The layperson’s knowledge of reputable psychological research, outside of the work of 
Freud or Skinner, is virtually nonexistent. One way to confirm this fact is to look in your 
local bookstore to see what material on psychology is available to the general public. 
Inspection will reveal that the material generally falls into three categories. First, there 
will be a few classics (Freud, Skinner, Fromm, Erikson, etc.) heavily biased toward
old-style psychoanalytic views that are totally unrepresentative of modern psychology. 
Frustratingly for psychologists, works of real worth in the field are often shelved in the 
science and/or biology sections of bookstores. For example, psychologist Steven Pinker’s 
well-known and esteemed book How the Mind Works (1997) is often in the science section 
rather than the psychology section. Thus, the important work in cognitive science that he 
discusses becomes associated with biology, neurophysiology, or computer science rather 
than psychology. For example, in my local Barnes & Noble store, the Science section has 
subsections labeled Biology, Chemistry, Earth, and Physics, of course. But it also has a
subsection labeled Cognitive Science, and in it are shelved some of the very best recent
books on psychological research: Kahneman’s (2011) Thinking, Fast and Slow; Wegner and
Gray’s (2016) The Mind Club; Gilovich and Ross’s (2015) The Wisest One in the Room; and
Tetlock and Gardner’s (2015) Superforecasting. None of these books are shelved in the
Psychology section at my Barnes & Noble, and thus no one in the public will associate the
first-rate psychological science in these volumes with the discipline of psychology itself.
The second class of material found in most stores might be called pseudoscience 
masquerading as psychology—that is, the seemingly never-ending list of so-called 
paranormal phenomena such as telepathy, clairvoyance, psychokinesis, precognition, 
reincarnation, biorhythms, astral projection, pyramid power, and psychic surgery. 
The presence of a great body of this material in the psychology sections of bookstores 
undoubtedly contributes to the widespread misconception that psychologists are the 
people who have confirmed the existence of such phenomena. There is a bitter irony for

interest in modern psychology. The reason, however, is a surprise to many people.
The statement that the study of ESP and other paranormal abilities is not accepted 
as part of the discipline of psychology will undoubtedly provoke the ire of many

believes in the existence of such phenomena and often holds these beliefs with considerable
fervor (Poppy, 2017; Shermer, 2011). Like most religions, many of the so-called
paranormal phenomena seem to promise things such as life after death, and for some 
people, they serve the same need for transcendence. It should not be surprising, then, 
that the bearer of the bad tidings that research in psychology does not validate ESP is 
usually not greeted with enthusiasm. 

The statement that psychology does not consider ESP a viable research area invari- 
ably upsets believers and often provokes charges that psychologists are dogmatic in 
banishing certain topics from their discipline. But this criticism is wrong. Scientists do 
not determine by edict which topics to investigate. No proclamation goes out declar- 
ing what can and cannot be studied. Areas of investigation arise and are expanded or 
terminated according to a natural selection process that operates on ideas and meth- 
ods. Those that lead to fruitful theories and empirical discoveries are taken up by a 
large number of scientists. Those that lead to theoretical dead ends or that do not yield 
replicable or interesting observations are dropped. 

The reason that ESP, for example, is not considered a viable topic in contemporary
psychology is simply that its investigation has not proved fruitful. Therefore,
very few psychologists are interested in it. It is important here to emphasize the word
“contemporary,” because the topic of ESP was of greater interest to psychologists 
some years ago, before the current bulk of negative evidence had accumulated. As 
history shows, research areas are not declared invalid by governing authorities; they 
are merely winnowed out in the competing environment of ideas. 

ESP was never declared an invalid topic in psychology. The evidence of this fact 
is clear and publicly available (Galak et al., 2012; Hand, 2014; Nickell & McGaha, 
2015). Many papers investigating ESP have appeared in legitimate psychological jour- 
nals over the years. As recently as 2011, a major APA journal published a paper on a 
parapsychological effect (Bem, 2011). Alas, as is so often the case, the effects reported 
appear not to be reliable or replicable (Galak et al., 2012; Wagenmakers et al., 2011). 

Parapsychologists who thrive on media exposure like to give the impression that 
the area is somehow new, thus implying that startling new discoveries are just around 
the corner. The truth is much less exciting. The study of ESP is actually as old as psy- 
chology itself. It is not a new area of investigation. It has been as well studied as many 
of the currently viable topics in the psychological literature. The results of the many 
studies that have appeared in legitimate psychological journals have been overwhelm- 
ingly negative. After more than 90 years of study, there still does not exist one example 
of an ESP phenomenon that is replicable under controlled conditions. In short, there is 
no demonstrated phenomenon that needs scientific explanation. For this reason alone, 
the topic is now of little interest to psychology. 

Psychologists have played a prominent role in attempts to assess claims of para- 
normal abilities. Many of the most important books on the state of the evidence on 
paranormal abilities have been written by psychologists. But given that this is the case, 
it is ironic that psychology, the discipline that has probably contributed most to the 
negative assessment of ESP claims, is the field that is most closely associated with such 
pseudosciences in the public mind. 


The Self-Help Literature 


The third category in the bookstore psychology section is the so-called self-help literature

generally increasing feelings of self-worth and competence. Others attempt to package
familiar bromides about human behavior in new ways. A few (but all too few) are
authored by responsible psychologists writing for the general public. Many that are

are usually designed not only to correct specific behavioral problems but also to help
satisfy general human wants (making more money, losing more weight, and having 
better sex are the “big three”), thereby ensuring larger book sales. These so-called new 
therapies are rarely based on any type of controlled experimental investigation. They 
usually rest on personal experience, or on a few case histories if the author is a clini- 
cian. This is often true of the treatments of so-called alternative medicine. 

The many behavioral and cognitive therapies that have emerged after painstaking 
psychological investigation as having demonstrated effectiveness are usually poorly rep- 
resented on the bookshelves. Lilienfeld (2012) estimates that of the 3,500 self-help books 
that are published each year, only about 5 percent of them have any scientific validation. 

The situation is even worse in the electronic media and the internet. Radio and TV 
carry virtually no reports of legitimate psychology and instead present purveyors of 
bogus “therapies” and publicity-seeking media personalities who have no connection 
to the actual field of psychology. The main reason is that the legitimate psychological 
therapies do not claim to provide an instant cure or improvement, nor do they guaran- 
tee success or claim a vast generality for their effects (“Not only will you quit smoking, 
but every aspect of your life will improve!”).

It is similar in the case of the internet. The lack of peer review ensures that the
therapies and cures that one finds there are often bogus. Here is one example. In 2008, 
Paul Offit published an important book titled Autism’s False Prophets, in which he 
detailed the many treatments for autism that have been found to be bogus by actual 
scientific research but that have enjoyed popularity among parents desperate for a 
treatment to help their children. One, facilitated communication, I have discussed in 
Chapter 6. Offit describes many other pseudoscientific treatments that have falsely 
raised parents’ hopes and have led them to spend thousands of dollars and to waste 
their time and energy chasing a bogus “cure.” On March 12, 2017, I identified one of 
the bogus chemical “cures” for autism discussed in Offit’s book (I will not name it in 
order not to add to its publicity) and typed it and the word “autism” into Google. Of 
the first ten links that appeared in the outcome of my search, four links were to web- 
sites that were advocating this bogus chemical remedy. 

Scientific accuracy is not guaranteed in a web search because websites are not peer 
reviewed. They thus provide no consumer protection for the random searcher with 
no further knowledge of the scientific literature on the topic in question. Advice given 
on television shows is little better (see Korownyk et al., 2014). Indeed, physicians are 
becoming more and more concerned about so-called cyberchondria (Peterson, 2012): 
people thinking that they are ill because of obsessive surfing on the web for negative 
symptoms. Indeed, the internet is full of bad medical advice—so much so that Google 
is working on search tools to remedy this problem (Mole, 2016). It is even more full of 
bad psychological advice.

In fact, the idea of recipe knowledge provides one way of conceptualizing the
difference between basic and applied research. The basic researcher seeks to uncover 
the fundamental principles of nature without necessarily worrying about whether 
they can be turned into recipe knowledge. The applied researcher is more interested in 
translating basic principles into a product that requires only recipe knowledge.

Although a number of psychological researchers do
work on turning basic behavioral principles into usable psychotherapeutic techniques,
health-maintaining behavior programs, or models of efficient industrial organization, 
psychological research is largely basic research aimed at uncovering general facts and 
theories about behavior. 

In all sciences, and in psychology in particular, there is usually a gap between the 
ideas that are productive for scientists and those that can be packaged to sell to the 
public. For example, there is legitimate research on “the power of positive thinking” in 
psychology (Sharot, 2011), but it bears little resemblance to the self-help prescriptions to 
that effect that were heard on The Oprah Show. Instead, the real psychological research 
literature is full of caveats, concerns about converging evidence, and the search for con- 
nectivity across research methods—in short, all of the real research concerns discussed 
in this book. 

Consider the area of weight loss prescriptions. Scientists have slowly accumulated 
evidence for some mild prescriptions that help with weight control, but they are not 
breakthrough remedies. It is clear that the problem of obesity is complex and is subject 
to our warnings about multiple causation (Hewer, 2014; University of California, 2015b). 
The problem will clearly not have a single magic-bullet solution. Many scientists have 
stressed, for example, how the complexities of the food environment itself (advertising, 
portion sizes, marketing to children) contribute to the nation’s obesity problem. The 
media want quick answers to questions that are of “public interest,” whereas science 
produces slow answers to questions that are scientifically answerable—and all the ques- 
tions that the public finds interesting might not be answerable. 


Psychology and Other Disciplines


Many other
allied disciplines, using a variety of different techniques and theoretical perspectives,
also contribute to our knowledge. Many problems concerning behavior call for an 
interdisciplinary approach. However, a frustrating fact that most psychologists must 
live with is that when work on an interdisciplinary problem is publicized, the contributions
of psychologists are often usurped by other fields.

was conducted decades ago under the aegis of the U.S. Surgeon General. Thus, it is
not surprising that the American Medical Association (AMA) passed a resolution to 
reaffirm the survey’s findings of a suggested causal link and to bring the conclusions 
more publicity. Again, there is nothing wrong here, but an unintended consequence 
of the association of the findings on televised violence with the AMA is that it cre- 
ated the impression that the medical profession had conducted the scientific research 
summarized in the report. In fact, the overwhelming majority of the research studies 
on the effects of television violence on children’s behavior have been conducted by 
psychologists. It was similar, decades later, when the American Academy of Pediatrics 
issued a report recommending limiting the internet use and cell phone use of children 
(Peterson, 2013). It was psychologists, not pediatricians, who did most of the scientific 
work that established these recommendations. 

One of the reasons that the work of psychologists is often ascribed to other dis- 
ciplines is that the word psychologist has, over the years, become ambiguous. Many 
research psychologists commonly append their research specialty to the word 
psychologist when labeling themselves, calling themselves, for example, physiological
psychologists, cognitive psychologists, industrial psychologists, evolutionary
psychologists, or neuropsychologists. Some use a label that does not contain a
derivative of the word “psychology” at all, for example, neuroscientist, cognitive 
scientist, artificial intelligence specialist, and ethologist. Both of these practices— 
in conjunction with the media’s bias that “psychology isn’t a science”—lead to the 
misattribution of the accomplishments of psychologists: The work of physiological 
psychologists is attributed to biology, the work of cognitive psychologists is attributed
to computer science and neuroscience, the work of industrial psychologists is
attributed to engineering and business, and so on.

Author Michael Lewis (2017), who wrote a book on Kahneman’s work,
admits that it is quite natural for a layperson to ask

Finally, psychology departments are often misunderstood even within their own 
universities. Susan Putnam, chair of the psychology program at Canisius College, 
described how she battled to get psychology classified as a science at her institution
(Weir, 2015).


Our Own Worst Enemies 


Lest it appear that we are blaming everyone else for psychology’s image problems, 
it is about time that we acknowledge the contribution of psychologists themselves to
confusion about their field.

However, the focus of this
section is on a different problem altogether;

This attitude presents a serious threat
to the integrity of psychotherapy. First, there is the proliferation of therapies that has
occurred because of a reluctance to winnow out those that do not work. Such a proliferation
not only removes a critical consumer protection but also promotes confusion in
the field. Second, there is an inconsistency in a therapeutic community that, on the one
hand, argues against scientific evaluation because it is “more art than science,” in the
common phrase, but is still greatly concerned about what is the 800-pound gorilla in the
room: reimbursement for services by government and private health insurers.


Some readers of the first few editions of this book commented that they thought 
I had “let psychologists get off too easily” by not emphasizing more strongly that 
unprofessional behavior and antiscientific attitudes among psychologists themselves 
contribute greatly to the discipline’s image problem. In trying to provide more balance 
here, I have relied heavily upon the work of Robyn Dawes (1994) and Scott Lilienfeld 
(2012; Lilienfeld et al., 2014). Dawes does not hesitate to air psychology’s dirty linen 
and, at the same time, to argue that the scientific attitude toward human problems 
that is at the heart of the true discipline of psychology is of great utility to society. For

pushed to defend licensure requirements as anything more than restraint of trade, however,
the organization uses its scientific credentials as a weapon (one president of the
APA, defending the organization from attack, said “Our scientific base is what sets us
apart from the social workers, the counselors, and the Gypsies”; Dawes, 1994, p. 21).
But the very methods that the field holds up to justify its scientific status have revealed
that the implication that licensed psychologists have a unique “clinical insight” is false
(Tracey et al., 2014).


Several categories of pseudoscience have flourished in clinical psychology during
the past few decades, including: unvalidated and bizarre treatments for trauma;
demonstrably ineffective treatments for autism such as facilitated communication 
(see Chapter 6); the continued use of inadequately validated assessment instruments 
(e.g., many projective tests); and the use of highly suggestive therapeutic techniques 
to unearth memories of child abuse (Baker et al., 2009; Lilienfeld, 2007, 2013). 


(Lilienfeld et al., 2014). The list is so long that we can only

and traumatic events such as bombings, shootings, combat, terrorism, and earthquakes
(Foa et al., 2013; McNally et al., 2003). The debriefing procedure involves having the
client “talk about the event and ventilate their emotions, especially in the company
of peers who have experienced the same incident” (McNally et al., 2003, p. 56), and
its purpose is to reduce the incidence of posttraumatic stress disorder (PTSD). The
majority of debriefed clients report that the experience was helpful. Of course, no one 
who has read this book will find that evidence convincing (recall the discussion of 
placebo effects in Chapter 4). A control group (which is not given the critical-incident 
stress debriefing) is obviously needed. In fact, “the vast majority of trauma survivors 
recover from initial posttrauma reactions without professional help” (McNally et al., 
2003, p. 45), so it clearly needs to be demonstrated that the recovery rate is higher 
when the critical-incident stress debriefing is used. Properly controlled studies have 
shown that this is not the case (Foa et al., 2013; McNally et al., 2003), yet the procedure 
continues to be used. 

Emery et al. (2005), in a review of a large body of evidence, have shown that,
likewise,

(Novotney, 2008). For example, they describe several assessment instruments
used by clinical psychologists purportedly to assess children’s best interests in
these custody disputes. After reviewing several of these instruments (for example,
scales purporting to assess the perception of relationships and parental awareness
skills), Emery et al. (2005) conclude that none of them have demonstrated reliability or
validity. They note that “no study examining the properties of these measures has ever 
been published in a peer-reviewed journal—an essential criterion for science” (p. 8) 
and conclude that “our bottom-line evaluation of these measures is a harsh one: these 
measures assess ill-defined constructs, and they do so poorly, leaving no scientific jus- 
tification for their use in child custody evaluations” (p. 7). 

Things may be looking up, however. 


(Baker et al., 2009, p. 67). This report received considerable publicity, but some
of the discussion in the general media confused the issue as much as clarified it. An 
otherwise accurate report in Newsweek magazine was unfortunately titled “Ignoring 
the Evidence: Why Do Psychologists Reject Science?” (Begley, 2009). The title mistak- 
enly implies that it is all psychology that rejects science, rather than the problematic 
subfield of clinical psychology. This confusing title is bitterly ironic given that the logic 
of the APS report was that of all the rest of psychology—which does adhere to the 
scientific method—speaking in distress to just one of its many subfields that does not 
(clinical psychology).

the recovered-memory-false-memory debate of the last two decades (Lilienfeld, 2007;
Loftus & Guyer, 2002; McHugh, 2008; Patihis et al., 2014). Many cases were reported 
of individuals who had claimed to remember instances of child abuse that had taken

induced by the therapy itself (Ammirati & Lilienfeld, 2015; Lilienfeld, 2007; Loftus &
Guyer, 2002). In the emotionally charged atmosphere of such an explosive social issue,
psychologists provided some of the more balanced commentary and, most important,
some of the more dispassionate empirical evidence on the issue of recovered or false
memories (Brainerd & Reyna, 2005; McNally & Geraerts, 2009; Moore & Zoellner, 2007; 
Patihis et al., 2014). 

Here we have the Jekyll and Hyde feature of psychology in full-blown form. Some 
of the cases of therapeutically induced false memories—and, hence, of the controver- 
sial phenomenon itself—were caused by incompetent and scientifically ignorant ther- 
apists who were psychologists. On the other hand, the resolution of the controversy 
we do have is in large part due to the painstaking efforts of research psychologists 
who studied the relevant phenomena empirically. Finally,

(Kenney, 2008; Novella, 2015).


In his book on research methods, psychologist Douglas Mook (2001) referred to my
use of the Rodney Dangerfield joke to title this chapter and commented that
“...” (p. 473). I agree completely with this sentiment.


Mook is right that the student of psychology needs to understand the paradoxes that
surround the discipline. As I have presented it in this book, as the science of human
behavior, the discipline of psychology often gets too little respect. But the face that 
psychology often presents to the public—that of a clinician claiming “unique” insight 
into people that is not grounded in research evidence—often gets too much respect. 
The discipline is often represented to the public by segments of psychology that do not 
respect its unique defining feature—that it validates statements about human behav- 
ior by employing the methods of science. 

There is another way, though, that psychology might be said to be getting too
much respect, and I will deal with this in the next section. It is an aspect of modern 
psychology that threatens the objectivity of the discipline. 


Our Own Worst Enemies, Part II: 
Psychology Has Become an Ideological 
Monoculture 


As I have mentioned, the discussion in the previous section was motivated by feed- 
back I have received from some readers who thought that this book was too positive 
about psychology. These readers of earlier editions of the book thought that I had “let 
psychology off the hook,” so to speak, because I did not say enough about the flaws 
within the discipline. The primary thing that these readers pointed me to was the
anti-scientific attitudes within psychology itself—primarily within clinical psychology. 
The previous section, which has appeared in several recent editions, was my attempt 
to accommodate the feedback of these critics of earlier editions.

Indeed, in earlier chapters I have pointed out some flaws in psychology as a science
that are generic to the discipline in that they span many of its subspecialty areas. For

Many of these criticisms have been problems within psychology for some time.
However, the issue I will discuss in the remainder of this section is a problem that 
has intensified in the last couple of decades. It is a problem that is much more seri- 
ously impeding psychology in 2018 than it was in 1986, when the first edition of this 
book appeared. That problem is the ideological homogenization of psychology as a
discipline. It has always been true that

Even 30 or 40 years ago, there were more liberal psychology professors
than there were conservative ones, more Democrats than there were Republicans.
But much converging research has shown that this imbalance has become even more 
marked in the last 20 years (Duarte et al., 2015)—so much so, that it would not be 
unfair to characterize the field of psychology as an ideological monoculture. Studies 
of social science departments in universities have indicated that 58-66% of professors 
identify themselves as liberals and just 5-8% as conservatives (Duarte et al., 2015). 
The imbalance in psychology departments is even worse, with 84% of professors 
identifying as liberal and just 8% as conservative. This imbalance has skyrocketed in recent 
years. In 1990, the ratio was 4 liberals for every 1 conservative in psychology depart- 
ments (the ratio is 1 liberal to 2 conservatives in the entire US population)—a strong 
imbalance, but still, the 20% of faculty who were conservatives at least provided some 
diversity. But by the year 2000, the ratio had climbed to 6 liberals for every 1 
conservative (Duarte et al., 2015). And by the year 2012, the ratio had risen to an astonishing 
14 to 1—virtually an ideological monoculture. 


It is true that an ideological imbalance will not be a problem for many areas of 
psychology. So we are not suggesting here that all areas of research in psychology 
have this problem, or even a majority of them. Nevertheless, it is a problem for 
research on politically charged topics: for example, parenting, sexuality, crime, 
and poverty. 


It should be clear why an ideological imbalance is a problem in areas of research 
like those I have just enumerated. In Chapter 2, I discussed the unique feature of 
science that allows it to overcome the myside bias of individual scientists. Recall from that 
discussion that I emphasized that science does not depend on scientists being uniquely 
virtuous (that they are completely objective or that they are never biased). Instead, 
scientists are kept honest by the social process of criticism and cross-checking. 
Unfortunately, an ideological monoculture undermines the ability of 
science to objectively approach politically charged topics like those mentioned above. 

It would be a mistake for psychologists to think that there are easy ways around 
this homogeneity—for instance, that they could just try harder to be objective on an 
individual basis. This would amount to denying what I stated in Chapter 2: that sci- 
entists are not uniquely virtuous in their objectivity; instead, they are kept honest by 
the social process of science. An ideological monoculture will not keep psychology hon- 
est in this way, because it removes the social milieu of criticism and cross-checking. 
Ironically enough, there is a well-known psychological phenomenon that suggests 
that it would be all too tempting for psychologists to think that they do not have this 
problem of myside bias—that they can set aside their ideological preferences while 
doing their science. The phenomenon is called the bias blind spot, which is the label 
for the finding that it is relatively easy for people to recognize bias in the decisions of 
others, but it is difficult to detect bias in their own judgments (Pronin, 2007). It would 
be all too easy for psychologists to think (wrongly) that they are immune from the bias 
blind spot and that ideological homogeneity is not a problem for their field. 

There is an additional reason why it would be tempting for psychologists to 
wrongly assume that they have a unique ability to avoid bias. As the statistics presented 
above show, the vast majority of psychologists are liberal Democrats. Liberal research 
psychologists have become accustomed, as we all have, to media presentations that 
are critical of conservative Republicans who do not accept the conclusions of climate 
science, or of evolutionary biology. These media presentations are correct, of course. 
Science denial has a negative connotation, and rightly so. However, there is a trap lying in wait for 
liberal psychologists here. It is a very tempting step to say to oneself: Well, I get climate 
science right and Republicans get it wrong; and I get evolution right and conservative 
Republicans get it wrong; so therefore we liberal psychologists are getting everything 
right about psychology too (again, think of all the charged topics we mentioned above: 
parenting, sexuality, crime, poverty, etc.). 

In short, psychologists might say to themselves: “Well, we may all be Democrats 
with no political variability among us, but that doesn’t matter because the Republicans 
deny science and we are the party of science.” That is pretty much what the Democratic 
Party did years ago when it declared itself the “party of science” and labeled the 
Republican Party as the science deniers. That stance spawned a series of books with 
titles like The Republican War on Science (Mooney, 2005). This might have been a fine 
political strategy for the Democratic Party, but research psychologists should know 
better. They should be able to see the obvious selection effects operating here—namely, 
that the issues in question (climate science and creationism/evolution) are cherry- 
picked for reasons of politics and media interest. In order to correctly call one party the 
party of science and the other the party of science deniers, one would of course have 
to have a representative sampling of scientific issues to see whether members of one 
party are more likely to accept the scientific consensus. 

In fact, and ironically, there are enough examples to produce 
a book parallel to the Mooney volume cited above titled Science Left Behind: Feel-Good 
Fallacies and the Rise of the Anti-Scientific Left (Berezow & Campbell, 2012). We 
mentioned two of these in earlier chapters [...] (Plomin et al., 2016); [...] (Bertrand 
et al., 2010; Black et al., 2008; CONSAD, 2009; Kolesnikova & Liu, 2011; O’Neill & 
O’Neill, 2012; Solberg & Laughlin, 1995). 

These aren’t the only two issues, though [...] (just 
as conservatives obfuscate the research on global warming) [...] (Chetty 
et al., 2014; McLanahan et al., 2013; Murray, 2012); [...] (Seidenberg, 2017). 

I will stop here because the point is made. There is plenty of science denial on the 
liberal side to balance the anti-scientific attitudes of conservatives toward climate change 
and evolutionary theory. This should 
be enough to sober psychologists and stop them from thinking that they are not subject 
to the bias blind spot that they themselves have discovered (Pronin, 2007). 

Duarte et al. (2015) provide several examples of such ideological bias in 
published research. They 
discuss a study that attempted to link the conservative worldview with “the denial of 
environmental realities.” Subjects were presented with the following item: If things 
continue on their present course, we will soon experience a major environmental 
catastrophe. If the subject did not agree with this statement, they were scored as deny- 
ing environmental realities. But as Duarte et al. point out, the term “denial” implies 
that what is being denied is a descriptive fact. However, without a clear description of 
what “soon” means in this statement, or “major” means, or what “catastrophe” means, 
the statement itself is not a fact—and so labeling one set of respondents as deniers 
reflects little more than the ideological biases of the study’s authors. Other statements 
in the questionnaire have a similar logic. If the subjects did not agree with vague envi- 
ronmental values and instead affirmed statements like “the balance of nature is strong 
enough to cope with the impacts of modern industrial nations” then they were coded 
as denying environmental “realities.” 

Another study discussed by Duarte et al. tried to link aspects of the conservative 
temperament to making unethical decisions. One item was a very short and vague 
scenario involving one employee sending a sexist email to another employee (Felicity) 
after the two had had a work disagreement. The subject was asked to take the position 
of a manager not involved in the incident and decide whether the manager should 
write a letter supporting Felicity in her sexual harassment complaint. The authors 
of the study coded as ethical behavior the manager immediately writing the letter. 
Anything less than that choice on the part of the subject was coded as less ethical, 
or even unethical. Because so little information was given about the case, this item 
measured little more than the subject’s pre-existing bias toward one party or the other 
in a sexual harassment case, yet the study was billed as an examination of unethical 
decision-making. Just like the environmental study in the previous paragraph, this 
research was displaying the tendency to mark legitimate policy differences as abso- 
lutely right (or ethical) or absolutely wrong (or unethical)—with a bias toward identi- 
fying the “right” response as the liberal one! 

This tendency to conflate liberal responses with the right response (or ethical 
response, or fair response, or open-minded response) is particularly prevalent in the 
subareas of social psychology and personality psychology. It often takes the form of 
labeling any legitimate policy difference with liberalism as some kind of intellectual 
or personality defect (dogmatism or authoritarianism or racism or prejudice). This has 
been true for many years in the study of racism by social psychologists. 
Someone with legitimate policy differences with affirmative action or busing (or 
someone indicating that they are concerned about crime) will almost always be 
scored in the racist direction on the scales used in these studies (Snyderman & Tetlock, 1986; Tetlock, 
1994). In such studies, the overt ideological bias of psychology is blatantly obvious to 
any neutral observer. The purpose of such studies seems, transparently, to be to label 
anyone that does not adhere to liberal orthodoxy as a racist. 

In fact, there is a whole subspecialty area in social psychology devoted to 
showing that negative traits such as prejudice and stereotyping and unfairness are 
associated with the conservative temperament. There is even a theory, the “intrinsic 
thesis,” that hypothesizes that the increasing political polarization surrounding 
scientific issues is due to the “psychological deficiencies among conservatives as 
compared to liberals” (Nisbet et al., 2015, p. 36). Recently, there has been a flurry 
of critiques showing 
that many of these studies have not been replicable, were poorly designed, or were 
designed and interpreted in a biased manner (Brandt et al., 2014; Chambers et al., 
2013; Crawford, 2012; Duarte et al., 2015; Jussim et al., 2016; Kahan, 2013; Nisbet 
et al., 2015; Oswald et al., 2013). 

And then there is the legion of studies that are overly hyped or misleadingly 
hyped because they seem to support a liberal conclusion. The classic example is the 
work on stereotype threat (see Jussim et al., 2016), which has been reported incor- 
rectly in many media outlets and psychology textbooks. The actual finding was that 
introducing a stereotype threat increased the test score difference between African- 
American and white college students (Jussim et al., 2016). Because the original authors 
used a confusing statistical reporting procedure, textbooks often report (incorrectly) 
that the study found that racial group test score differences are eliminated when ste- 
reotype threat is removed. That is not at all the finding, but it is the one that was 
propagated widely because of psychology’s ideological monoculture. 

The examples could continue (see Duarte et al., 2015; Jussim et al., 2016), but I will 
stop here to stress that all of this does psychology no good. This ideological bias in 
the discipline is becoming more discernible to the general public. Indeed the psychol- 
ogy monoculture has made true the old joke: Psychology departments exist so that 
Democrats can say “studies show.” More seriously for the discipline, it will certainly 
be true that funding agencies will become more aware of the ideological bias, as will 
the state legislatures funding the individual departments at their state universities. 
None of this will be good for psychology. 

Our ideological monoculture has prevented us from challenging the host of quasi- 
psychological concepts that have been used in the last decade to suppress free speech 
on university campuses (Lukianoff & Haidt, 2015). Most of these concepts—concepts 
such as safe spaces, trigger warnings, rape culture, or micro-aggressions—have no 
empirical or theoretical grounding in psychological research. Yet psychologists have 
not been prominent in explaining to university students and administrators that these 
concepts are not grounded in psychological science. Indeed, many psychologists have 
been high-level administrators in some of the universities that are proliferating these 
concepts like mental viruses. A notable exception is Scott Lilienfeld’s (2017) thorough 
explication of what it would take, research-wise, to properly ground the concept of 
micro-aggression—to change it from its present status as a mere political weapon 
into a behavioral science concept. Not surprisingly for those of you who have read 
Chapter 6 (Clever Hans, etc.), he recommends changing the term to something more 
neutral that does not carry so much theory with it. 

Finally, psychology’s image is not helped by the tendency of one of its 
organizations, the American Psychological Association, to go way beyond science into 
political advocacy. In a perceptive article on psychology’s image problem, Ferguson 
(2015) discusses how the APA’s public policy statements have repeatedly strayed 
into politics and have reinforced the public view that the organization is not 
scientific but an advocacy organization (politically, for liberal and Democratic 
positions). Ferguson discusses APA policy statements on abortion and welfare reform as 
particularly problematic: more politics than science. The ideological monoculture 
in university psychology departments is thus mirrored by the most publicly visible 
organization representing psychologists, the APA. 


Isn’t Everyone a Psychologist? 
Implicit Theories of Behavior 


We all have theories about human behavior. It is hard to see how we could get through 
life if we did not. In this sense, we are all psychologists. It is very important, though, to 
distinguish between this individual psychology and the type of knowledge produced 
by the science of psychology. The distinction is critical because the two are often delib- 
erately confused in popular writings about psychology, as we shall see. 

Much of our personal psychological knowledge is recipe knowledge. We do 
certain things because we think they will lead others to behave in a certain way. We 
behave in particular ways because we think that certain behavior will help us achieve 
our goals. But it is not the mere presence of recipe knowledge that distinguishes per- 
sonal psychology from scientific psychology (which also contains recipe knowledge). 
The main difference here is that the science of psychology seeks to validate its recipe 
knowledge. Scientific evaluation is systematic and controlled in ways that individual 
validation procedures can never be. 

In addition, science always aspires to be more than recipe knowledge of the 
natural world. Scientists seek more general, underlying principles that explain why 
the recipes work. Rather than being coherently constructed, many people’s personal 
psychological theories are merely a mixture of platitudes and clichés, often mutually 
contradictory, that are used on the appropriate occasion. They reassure people that 
an explanation does exist and, furthermore, that the danger of a seriously contradic- 
tory event—one that would deeply shake the foundations of a person’s beliefs—is 
unlikely to occur. As discussed in Chapter 2, although these theories may indeed be 
comforting, comfort is all that theories constructed in this way provide. In explaining 
everything post hoc, these theories predict nothing. By making no predictions, they 
tell us nothing. Theories in the discipline of psychology must meet the falsifiability 
criterion, and in doing so, they depart from the personal psychological theories of 
many laypeople. Theories in psychology can be proved wrong, and, therefore, they 
contain a mechanism for growth and advancement that is missing from many personal 
theories. 


The Source of Resistance to Scientific 
Psychology 


For the reasons we just discussed, it is important not to confuse the idea of a personal 
psychological theory with the knowledge generated by the science of psychology. 
Such confusion is often deliberately fostered to undermine the status of psychology in 
the public mind. The idea that “everyone’s a psychologist” is true if it is understood 
to mean simply that we all have implicit psychological theories. But it is often subtly 
distorted to imply that psychology is not a science. 

We discussed in Chapter 1 why the idea of a scientific psychology is threatening. 
It is natural that individuals who have long served as commentators on human psychology 
and behavior will resist any threatened reduction in their authoritative role. Chapter 1 
described how the advance of science has continually usurped the authority of other 
groups to make claims about the nature of the world. The movement of the planets, 
the nature of matter, and the causes of disease were all once the provinces of theo- 
logians, philosophers, and generalist writers. Astronomy, physics, medicine, genet- 
ics, and other sciences have gradually wrested these topics away and placed them 
squarely within the domain of the scientific specialist. 

The issue, then, is the changing criteria of belief evaluation. Few newspaper 
editorials ever come out with strong stands on the composition of the rings of Saturn. Why? No 
censor would prevent such an editorial. Clearly the reason it is not written is that it would 
be futile. Society knows that scientists, not editorial writers, determine such things. Some 
commentators on human behavior, however, claim the right to their opinions even 
when these opinions contradict the facts. Of course, the correct term here is really not 
“right,” because, obviously, in a free society, everyone has the right to voice opinions, 
regardless of their accuracy. It is important to understand that what many people want is 
much more than simply the right to declare their opinions about human behavior. What 
they really want is the conditions that are necessary for what they say to be believed. When they 
make a statement about human psychology, they want the environment to be conducive 
to the acceptance of their beliefs. This is the reason that there are always proponents of the 
“anything-goes” view of psychology; that is, the idea that psychological claims cannot be 
decided by empirical means and are simply a matter of opinion. But science is always a 
threat to the “anything-goes” view, because it has a set of strict requirements for deter- 
mining whether a knowledge claim is to be believed. Anything does not go in science. 
This ability to rule out false theories and facts accounts for scientific progress. 

In short, a lot of the resistance to scientific psychology is due to what might be 
termed “conflict of interest.” As discussed in earlier chapters, many pseudosciences 
are multimillion-dollar industries that thrive on the fact that the public is unaware that 
statements about behavior can be empirically tested. The public is also unaware that 
many of the claims that are the basis of these industries (such as astrological prediction, 
subliminal weight loss, biorhythms, facilitated communication, and psychic surgery) 
have been tested and found to be false. Unproven medical remedies end up costing the 
public more than is spent on legitimate medical research (Mielczarek & Engler, 2012). 

How do we recognize pseudoscientific claims? Clinical psychologist Scott 
Lilienfeld (2005, p. 40) gives us a list of things to watch for that could serve as a sum- 
mary of many of the things that have been covered in this book. Pseudoscientific 
claims tend to be characterized by: 


• A tendency to invoke ad hoc hypotheses as a means of immunizing claims from 
  falsification 
• An emphasis on confirmation rather than refutation 
• A tendency to place the burden of proof on skeptics, not proponents, of claims 
• Excessive reliance on anecdotal and testimonial evidence to substantiate claims 
• Evasion of the scrutiny afforded by peer review 
• Failure to build on existing scientific knowledge (lack of connectivity) 


True scientists are at pains to emphasize these criteria rather than to avoid them. In 
response, the pseudoscience industry continues to oppose the authority of scientific psy- 
chology to evaluate behavioral claims. However, the purveyors of pseudoscience often do 
not need to do direct battle with psychology. They simply do an end run around 
psychology and go straight to the media with their claims. The media make it very easy for cranks: 
talk show guests are not asked 
to produce their bibliographies of scientific research. If these guests are “interesting,” they 
are simply put on the show. And the internet is no better. Anyone can put up a website 
claiming—and selling—anything. Websites are not peer reviewed, to say the least! 

Science, then, does rule out knowledge claims that do not meet the necessary 
tests. The courts rule out claims of knowledge too. In ruling on a famous case known 
as Daubert vs. Merrell Dow, the Supreme Court established when expert testimony 
could be presented in court—that is, what makes expert testimony expert! The Court 
identified four factors that judges should consider when deliberating about whether 
to allow expert testimony: (a) the “testability” of the theoretical basis for the opinion; 
(b) the error rates associated with the approach, if known; (c) whether the technique 
or approach on which the opinion is based has been subjected to peer review; and 
(d) whether the technique or approach is generally accepted in the relevant scientific 
community. These factors correspond to themes of this book, among them knowledge 
subjected to peer review and scientific knowledge based on convergence and 
sensus. The courts, like science, have ruled out claims of special knowledge, intuition, 
and testimonials as adequate evidence. 

In this book, we have briefly touched on what are considered adequate and inad- 
equate tests in science. Introspection, personal experience, and testimonials are all 
considered inadequate tests of claims about the nature of human behavior. Thus, it 
should not be surprising that conflict arises because these are precisely the types of 
evidence that nonpsychologist commentators have been using to support their state- 
ments about human behavior since long before a discipline of psychology existed. 

However, it should not be thought that I am recommending a sour, spoilsport 
role for the science of psychology. Quite the contrary. The actual findings of legitimate 
psychology are vastly more interesting and exciting than the repetitious gee-whiz 
pseudoscience of the media. Furthermore, it should not be thought that scientists are 
against fantasy and imagination. However, we want fancy and fantasy when we go 
to the movies or the theater—not when we go to the doctor’s office, buy insurance, 
register our children for child care, fly in an airplane, or have our car serviced. We 
could add to this list: going to a psychotherapist, having our learning-disabled child 
tested by a school psychologist, or taking a friend to suicide-prevention counseling at 
the university psychology clinic. Psychology, like other sciences, must remove fantasy, 
unfounded opinion, “common sense,” commercial advertising claims, the advice of 
gurus, testimonials, and wishful thinking from its search for the truth. 

It is difficult for a science to have to tell parts of society that their thoughts and 
opinions are needed—but not here. Psychology is the latest of the sciences to be in this 
delicate position. The difference in time period for psychology, however, is relevant. 
Most sciences came of age during periods of elite control of the structures of soci- 
ety, when the opinion of the ordinary person made no difference. Psychology, on the 
other hand, is emerging in a media age of democracy and ignores public opinion at its 
own peril. Many psychologists are now taking greater pains to remedy the discipline’s 
neglect of public communication. As they take on a larger 
communication role, the conflicts with those who confuse a personal psychology with 
scientific psychology are bound to increase. 

Not everyone is a physicist, even though we all hold intuitive physical theories. 
But in giving up the claim that our personal physical theories must usurp scientific 
physics, we make way for a true science of the physical universe whose theories, 
because science is public, will be available to us all. Likewise, everyone is not a psy- 
chologist. But the facts and theories uncovered by the science of psychology are 
available to be put to practical ends and to enrich the understanding of all of us. 


The Final Word 


We are now at the end of our sketch of how to think straight about psychology. It is 
a rough sketch, but it can be of considerable help in comprehending how the disci- 
pline of psychology works and in evaluating new psychological claims. Our sketch 
has revealed the following: 


1. Psychology progresses by investigating solvable empirical problems. This prog- 
ress is uneven because psychology is composed of many different subareas, and 
the problems in some areas are more difficult than in others. 

2. Psychologists propose falsifiable theories to explain the findings that they 
uncover. 

3. The concepts in the theories are operationally defined, and these definitions 
evolve as evidence accumulates. 

4. These theories are tested by means of systematic empiricism, and the data 
obtained are in the public domain, in the sense that they are presented in a man- 
ner that allows replication and criticism by other scientists. 

5. The data and theories of psychologists are in the public domain only after publi- 
cation in peer-reviewed scientific journals. 

6. What makes empiricism systematic is that it strives for the logic of control and 
manipulation that characterizes a true experiment. 

7. Psychologists use many different methods to arrive at their conclusions, and 
the strengths and weaknesses of these methods vary. 


8. The behavioral principles that are eventually uncovered are almost always 
probabilistic relationships. 
9. Most often, knowledge is acquired only after a slow accumulation of data from 
many experiments, each containing flaws but nonetheless converging on a 
common conclusion. 

The most exciting endeavor in science today is the quest to understand the nature 
of human behavior. By learning the concepts in this book you become able to follow 
this quest and perhaps, indeed, become a part of it! 